Abstract

This paper considers the multi-antenna multiple access relay channel (MARC), in which multiple users transmit
messages to a common destination with the assistance of a relay. In a variety of MARC settings, the dynamic
decode and forward (DDF) protocol is very useful due to its outstanding rate performance. However, the lack of
good structured codebooks has so far hindered practical applications of DDF for the MARC. In this work, two classes of
structured MARC codes are proposed: 1) one-to-one relay-mapper aided multiuser lattice coding (O-MLC), and 2)
modulo-sum relay-mapper aided multiuser lattice coding (MS-MLC). The former enjoys better rate performance,
while the latter provides more flexibility to trade off between the complexity of the relay mapper and the rate
performance. It is shown that, in order to approach the rate performance achievable by an unstructured codebook
with maximum-likelihood decoding, it is crucial to use a new K-stage coset decoder for structured O-MLC, instead
of the one-stage decoder proposed in previous works. However, if O-MLC is decoded with the one-stage decoder
only, it can still achieve the optimal DDF diversity-multiplexing gain tradeoff in the high signal-to-noise ratio
regime. As for MS-MLC, its rate performance can approach that of O-MLC by increasing the complexity of
the modulo-sum relay mapper. Finally, for practical implementations of both O-MLC and MS-MLC, practical short
length lattice codes with linear mappers are designed, which facilitate efficient lattice decoding. Simulation results
show that the proposed coding schemes outperform existing schemes in terms of outage probabilities in a variety
of channel settings.

I. INTRODUCTION 

In recent years, cooperative communication has drawn a significant amount of interest as a means 
of providing spatial diversity when time, frequency or antenna diversities are unavailable due to delay, 
bandwidth or terminal size constraints, respectively. Cooperative communication techniques for single- 
source networks have been extensively studied in terms of rate, outage probability or diversity-multiplexing 
tradeoff (DMT) perspectives [1], [2], [3]. However, practical communication networks usually involve more
than one source (user), leading to the study of the multiple-access channel (MAC). In this paper, we 
consider an important multi-user cooperative communication channel, that is, the multi-antenna multiple- 
access relay channel (MARC). The MARC is a MAC with an additional shared half-duplex relay [4].
It has been shown that the MARC provides a much larger achievable rate region 01 and diversity gain 
per user [5], compared to those of the MAC. Also, since a single relay is shared by multiple users in
the MARC, the extra cost of adding such a relay is acceptable. However, the code design for the MARC 
needs to jointly consider the codebooks of the multiple users and the relay [HI H [|71, and is thus not a 
trivial extension of those for the single-user relay channel or the multiple access channel. 

The achievable rate region of the MARC has been characterized in [4], [6] and [7]. The decode and
forward protocol, which is a special case of the dynamic decode and forward (DDF) protocol [8], was
shown to achieve the capacity region of the MARC when the source-relay link is good enough [7], thus
having a larger achievable rate region than those of the multiple-access amplify and forward (MAF) [5]
and compress and forward (CF) protocols [9]. However, the capacity region of the general MARC remains
unknown. The DMT for the MARC with single antenna nodes was studied in [[51 [[H and [[3. Although 
the MAF and CF are both DMT optimal in the high multiplexing gain regime [5], [9], compared with the
DDF strategy, they both achieve lower diversity gains in the low to medium multiplexing gain regimes
[5], [9]. Moreover, in [5], simulation results show that the DDF protocol yields a better outage probability
than MAF and CF over a large range of signal-to-noise ratios (SNRs), even in the high multiplexing
gain regime. Thus we focus on the DDF in this paper due to its good performance in a variety of operating
settings.

However, previous results in [4]-[9] are based on unstructured random codebooks and maximum
likelihood (ML) decoders, and are very difficult to implement in practice. In this paper, we propose
structured multiuser lattice coding aided by a relay mapper for the MARC under the DDF protocol, in
which each node in the MARC has multiple antennas. To simplify the joint codebook design problem for
the multiple users and the relay, we introduce a relay mapper which selects the codeword to be transmitted
at the relay to aid the users' transmissions. The relay mapper is a key new ingredient of our coding design;
it can also help implement the unstructured codebooks in [4], [6], [7] and [8] in practice, and does
not appear in [4]-[9]. However, the introduction of the relay mapper makes the decoding much more
difficult than that for the MAC [10]. We will see that the one-stage coset decoding proposed in [10] fails
to achieve the rate performance of the unstructured codebook with the ML decoding demonstrated in [7].
Instead, we propose a new K-stage coset decoder that achieves the rate performance in [7] by successive
cancellation on the multiuser decoding tree. Two classes of relay mapper aided multiuser lattice coding
are proposed: 1) one-to-one relay mapper aided multiuser lattice coding (O-MLC), and 2) modulo-sum
relay mapper aided multiuser lattice coding (MS-MLC). The first enjoys better rate performance while
the second provides more flexibility to trade off between the complexity of the relay mapper and the rate
performance. With the K-stage coset decoder, the structured O-MLC can achieve the rate performance
obtained by the unstructured codebook in [7]. If only one-stage coset decoding is used, we also show
that O-MLC is DMT optimal for the DDF, and has a better DMT than those in [5] and [9] for the low to
medium multiplexing gain regime. As for MS-MLC, when the codomain size of the modulo-sum relay
mapper becomes larger, the error performance of MS-MLC approaches that of O-MLC. Moreover, our
decoder is no longer a simple lattice decoder like that of [10], since the lattice structure for decoding may
be destroyed by the relay mapper. Further, a naive application of the theoretical error analysis in [10]
suffers from significant losses in predicting the achievable rates of the proposed coding. We overcome this
problem by introducing a new technique for bounding the error probability over the random relay-mapper
codebook ensemble. Finally, to implement our theoretical results, we construct practical lattice codebooks
with linear mappings for both O-MLC and MS-MLC, which enable the decoder to use the efficient lattice
decoding algorithms in [11] and [12].

Compared with codes appearing in previous works 0], [|6]-[Q which are difficult to implement, our 
structured MARC coding can be implemented in practice, as we will see below. Some practical MARC
code designs were proposed in [13] and [14], but these studies lack theoretical performance analysis. In
[13] and [14], an orthogonal protocol was used in which the users and the relay must transmit in different
time slots to avoid interference, while our scheme allows them to transmit simultaneously. Moreover, in
[14], instead of joint code design, the relay's transmitted symbol is formed from the users' symbols with
a simple transformation. For these reasons, there are significant losses in the achievable rates and
DMTs for the methods in [13] and [14] compared with our schemes. In simulations, we show that our
proposed lattice coding schemes also outperform the schemes in [5], [9], [13] and [14] in terms of outage
probabilities.

Fig. 1. Dynamic decode and forward (DDF) for the K-user multiple-antenna multiple-access relay channel (MARC), where Phase 1 is the
relay's listening phase while Phase 2 is the relay's transmitting phase.

The rest of this paper is organized as follows. Section II introduces the system model, and some
frequently used notation is summarized in Table I. In Section III, O-MLC and MS-MLC are introduced.
In Section IV, we establish the achievable rate region for both O-MLC and MS-MLC and show that
O-MLC is DMT optimal. In Section V, simulation results are presented, and Section VI concludes the
paper.

II. System Model 

We consider the $K$-user multiple-antenna MARC as shown in Fig. 1, in which a relay node is assigned
to assist the multiple-access users in transmitting data to a common destination. Each user and the relay
are equipped with $M_u$ and $M_r$ antennas, respectively, and the destination has $N$ antennas. In the DDF for
the MARC, each codeword spans $L$ slots each consisting of $T$ vector symbols, and the block of $LT$ vector
symbols is split into two phases due to the half-duplex constraint at the relay node (i.e., it cannot transmit
and receive simultaneously). In Phase 1, the relay receives the signals from the users and tries to
decode the users' messages until the decision time $\ell_1 T$. Following [8], $\ell_1 T$ is chosen to be the earliest
time index such that after $\ell_1 T$ symbols, the relay can decode the users' messages without error. If there
is no such $\ell_1 \in \{1, \ldots, L-1\}$, the relay remains silent. Let the $M_r \times M_u$ and $N \times M_u$ channel matrices from
user $i$ to the relay and the destination be $\mathbf{H}_{r,i}$ and $\mathbf{H}_{d,i}$, respectively, which are perfectly known at the
corresponding receivers. For Phase 1, the received $M_r \times 1$ vector of symbols at the relay is$^1$



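$$\mathbf{y}_{r,l} = \sqrt{\frac{\rho_r}{M_u}} \sum_{i=1}^{K} \mathbf{H}_{r,i}\,\mathbf{x}_{i,l} + \mathbf{n}_l, \quad l = 1, 2, \ldots, \ell_1 T, \qquad (1)$$

where $\rho_r$ is the received SNR at the relay, $\mathbf{x}_{i,l}$ is the $M_u \times 1$ vector signal transmitted by user $i$ at time
index $l$, and the noise at the relay $\mathbf{n}_l \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{M_r})$ is a Gaussian vector with independent and identically
distributed (i.i.d.) entries. Similar to (1), the received vector of symbols at the destination in Phase 1 is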




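$$\mathbf{y}_{d,l} = \sqrt{\frac{\rho_d}{M_u}} \sum_{i=1}^{K} \mathbf{H}_{d,i}\,\mathbf{x}_{i,l} + \mathbf{v}_l, \quad l = 1, 2, \ldots, \ell_1 T, \qquad (2)$$

where $\rho_d$ is the received SNR and $\mathbf{v}_l \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_N)$ is the noise vector at the destination. In Phase 2 of DDF,
based on the decoded messages obtained at the decision time $\ell_1 T$, the relay transmits the corresponding
coded vector symbols to the destination. The signal received by the destination is then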



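$$\mathbf{y}_{d,l} = \sqrt{\frac{\rho_d}{M_u}} \sum_{i=1}^{K} \mathbf{H}_{d,i}\,\mathbf{x}_{i,l} + \sqrt{\frac{\rho_d}{M_r}}\,\mathbf{H}_{d,K+1}\,\mathbf{x}_{K+1,l} + \mathbf{v}_l, \quad l = \ell_1 T + 1, \ldots, LT, \qquad (3)$$

where $\mathbf{x}_{K+1,l}$ denotes the signal transmitted by the relay and $\mathbf{H}_{d,K+1}$ is the channel matrix from the
relay to the destination. As for the (normalized) MARC input power constraint, it is imposed on each user
and the relay as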



$$\frac{1}{LT}\sum_{l=1}^{LT} E\left[|\mathbf{x}_{i,l}|^2\right] \le M_u, \quad i = 1, \ldots, K, \qquad \frac{1}{LT}\sum_{l=1}^{LT} E\left[|\mathbf{x}_{K+1,l}|^2\right] \le M_r, \qquad (4)$$

where the expectation $E[\cdot]$ is taken over all codewords in the codebook.

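$^1$ Notation: Let $\mathcal{A}$ be a set; then $\mathcal{A}^* = \mathcal{A} \setminus \{0\}$, $\mathcal{A}^c$ denotes the complement of $\mathcal{A}$, and $|\mathcal{A}|$ denotes the cardinality of $\mathcal{A}$. For a matrix $\mathbf{M}$, $\mathbf{M}^H$ is
the conjugate transpose and $|\mathbf{M}|$ is the determinant. We use $\log(\cdot)$ for the logarithm with base 2, and $\times$ for the direct product. An $n$-dimensional
real lattice $\Lambda$ is a discrete additive subgroup of $\mathbb{R}^n$. The lattice quantization function is defined as $Q_\Lambda(\mathbf{y}) = \arg\min_{\lambda \in \Lambda} |\mathbf{y} - \lambda|$ for $\mathbf{y} \in \mathbb{R}^n$, and
the modulo-lattice operation is $\bar{\mathbf{y}} = \mathbf{y} \bmod \Lambda = \mathbf{y} - Q_\Lambda(\mathbf{y})$ [15]. The second-order moment of $\Lambda$ is defined as $\sigma^2(\Lambda) = \frac{1}{n V_f(\Lambda)} \int_{\mathcal{V}} |\mathbf{x}|^2 d\mathbf{x}$, where
$\mathcal{V}$ and $V_f(\Lambda)$ are given in (T1.2) and (T1.3) in Table I, respectively. Some other frequently used notation is also summarized in Table I.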



To simplify the presentation of the proposed lattice coding scheme, it is useful to transform our received
signal model (1), (2) and (3) into the equivalent real channel model form as in (5) and (6), for the relay
and the destination, respectively,

$$\mathbf{y}_{relay} = \mathbf{H}_{relay}\,\mathbf{x}_{relay} + \mathbf{n}_{relay} \qquad (5)$$
$$\mathbf{y}_{dst} = \mathbf{H}_{dst}\,\mathbf{x}_{dst} + \mathbf{n}_{dst}. \qquad (6)$$

The equivalent channel for the destination (6) is formed by concatenating the received signal (2) in Phase
1 and (3) in Phase 2, and the $2(KM_u + M_r)LT \times 1$ super signal vector $\mathbf{x}_{dst}$ in (6) is

$$\mathbf{x}_{dst} = [\mathbf{x}_1^T, \ldots, \mathbf{x}_{K+1}^T]^T, \qquad (7)$$

where $\mathbf{x}_i = [\{\mathbf{x}_{i,1}^R\}^T, \ldots, \{\mathbf{x}_{i,LT}^R\}^T]^T$ with $\mathbf{x}_{i,l}^R = [\mathrm{Re}\{\mathbf{x}_{i,l}\}^T, \mathrm{Im}\{\mathbf{x}_{i,l}\}^T]^T$; the $2NLT \times 1$ super received
signal and noise at the destination, $\mathbf{y}_{dst}$ and $\mathbf{n}_{dst}$ in (6), are defined similarly. The $2NLT \times
2(KM_u+M_r)LT$ super-channel matrix $\mathbf{H}_{dst}$ in (6) is $\mathbf{H}_{dst} = [\mathbf{H}_1^d, \ldots, \mathbf{H}_{K+1}^d]$, where the $2NLT \times 2M_uLT$
equivalent channel matrix $\mathbf{H}_i^d$ for user $i$ comes from (2) as




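$$\mathbf{H}_i^d = \sqrt{\frac{\rho_d}{M_u}}\; \mathbf{I}_{LT} \otimes \begin{bmatrix} \mathrm{Re}\{\mathbf{H}_{d,i}\} & -\mathrm{Im}\{\mathbf{H}_{d,i}\} \\ \mathrm{Im}\{\mathbf{H}_{d,i}\} & \mathrm{Re}\{\mathbf{H}_{d,i}\} \end{bmatrix}, \qquad (8)$$

where $\otimes$ denotes the Kronecker product and $i = 1, \ldots, K$, while the equivalent channel matrix $\mathbf{H}_{K+1}^d$ for
the relay comes from (3) as

$$\mathbf{H}_{K+1}^d = \sqrt{\frac{\rho_d}{M_r}}\; \mathrm{diag}\left(\mathbf{0}_{2N\ell_1 T \times 2M_r\ell_1 T},\; \mathbf{I}_{(L-\ell_1)T} \otimes \begin{bmatrix} \mathrm{Re}\{\mathbf{H}_{d,K+1}\} & -\mathrm{Im}\{\mathbf{H}_{d,K+1}\} \\ \mathrm{Im}\{\mathbf{H}_{d,K+1}\} & \mathrm{Re}\{\mathbf{H}_{d,K+1}\} \end{bmatrix}\right) \qquad (9)$$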



if $1 \le \ell_1 \le L-1$, where the first $2N\ell_1 T \times 2M_r\ell_1 T$ block is a zero matrix because the relay is listening in Phase
1 (if $\ell_1 = L$, $\mathbf{H}_{K+1}^d = \mathbf{0}_{2NLT \times 2M_rLT}$ since the relay is silent). The equivalent channel for the relay (5)
can be obtained similarly from (1), with the dimensions of $\mathbf{H}_{relay}$ being $2M_rLT \times 2KM_uLT$.
We consider two kinds of channel settings, the fixed channel and the slow fading channel. In the fixed
channel setting, the channels are deterministic and we use the achievable rate as a performance metric.
For the slow fading channel, $\mathbf{H}_{dst}$ and $\mathbf{H}_{relay}$ are random but remain constant over the whole code block.
Since the MARC then cannot support any non-zero rate tuple with vanishing error probability, we use
the DMT or the outage probabilities as performance metrics. The entries of the channel matrices are
assumed to be i.i.d. $\mathcal{CN}(0, 1)$ when they are slow faded; i.e., we assume Rayleigh fading in this case.
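
The passage from the complex model to the real model rests on the block structure [Re, -Im; Im, Re] used in (8) and (9). The following is a minimal sketch of that single step for one time index, without the SNR scaling and without the Kronecker extension over the $LT$ symbols; the helper names `real_block`, `real_vec` and `matvec` are illustrative assumptions and do not come from the paper.

```python
# Sketch only: real-valued equivalent of one complex MIMO link use.
# Helper names are hypothetical; scaling and the Kronecker extension are omitted.

def real_block(H):
    """Map an N x M complex matrix to the 2N x 2M real block [[Re, -Im], [Im, Re]]."""
    top = [[h.real for h in row] + [-h.imag for h in row] for row in H]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in H]
    return top + bottom

def real_vec(x):
    """Stack real parts on top of imaginary parts: [Re{x}^T, Im{x}^T]^T."""
    return [v.real for v in x] + [v.imag for v in x]

def matvec(A, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

H = [[1 + 2j, 0.5 - 1j],
     [-1j, 2 + 0j]]
x = [1 - 1j, 0.5 + 2j]

y_complex = matvec(H, x)                       # complex-domain product H x
y_real = matvec(real_block(H), real_vec(x))    # same product in the real domain
```

Stacking `y_complex` with `real_vec` gives the same numbers as `y_real`, which is the property the real-valued model relies on.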
III. Proposed Relay-Mapper Aided Multiuser Lattice coding schemes
In this section, we specify the proposed multiuser lattice coding schemes for the MARC, i.e., O-MLC
and MS-MLC. Each of O-MLC and MS-MLC consists of three building blocks: 1) the relay mapper,
which decides which codeword is to be transmitted at the relay, 2) Loeliger-type nested lattices for the
users' and the relay's codebooks, and 3) a $K$-stage coset decoder, which generalizes the one-stage decoder
of [10]. We first briefly introduce the adopted lattice codebooks. Tailored for them, the relay mappers,
the one-to-one mapper $\psi^{one}$ and the modulo-sum mapper $\psi^{sum}$, for O-MLC and MS-MLC, respectively,
are presented in Section III-B. Then the whole encoding/decoding blocks are introduced in Section III-C.

A. Loeliger-type Nested Lattice Codebooks 

In our code construction, the codebooks of the $i$-th user ($1 \le i \le K$) and the relay ($i = K+1$) are generated
from Loeliger-type nested lattices. To be specific, we introduce the following definitions.

Definition 1 (Self-similar nested lattice code): For user $i$, let $\Lambda_{C_i}$ be a $2M_uLT$-dimensional coding lattice
and $\Lambda_{S_i} \subset \Lambda_{C_i}$ be the shaping lattice. The nested lattice codebook is defined as $\mathcal{C}_i = \{\bar{\mathbf{c}}_i : \bar{\mathbf{c}}_i = \mathbf{c}_i
\bmod \Lambda_{S_i}, \mathbf{c}_i \in \Lambda_{C_i}\}$, where $\bar{\mathbf{c}}_i$ are the coset leaders [15] of the partition $\Lambda_{C_i}/\Lambda_{S_i}$ (the set of cosets of $\Lambda_{S_i}$
relative to $\Lambda_{C_i}$). The codebook size is $|\mathcal{C}_i| = 2^{R_iLT}$, where the code rate is $R_i$ bits per channel use
(BPCU). When $\Lambda_{S_i} = (2^{R_i/(2M_u)})\Lambda_{C_i}$, where $(2^{R_i/(2M_u)}) \in \mathbb{N}$ is the nesting ratio, the nested lattice code $\mathcal{C}_i$
is called a self-similar nested code.$^2$

For a Loeliger-type nested-lattice ensemble, the coding lattice $\Lambda_{C_i}$ for user $i$ is randomly chosen from the
Loeliger lattice ensemble, which is generated from linear codes [17]. The detailed definition is given
in Definition 5 in Appendix A-(I).

The codebook for the relay is generated similarly, but with dimension $2M_rLT$.
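
To make Definition 1 concrete, the sketch below enumerates a self-similar nested code for a toy case. It assumes the integer lattice $\mathbb{Z}^n$ as the coding lattice and $a\mathbb{Z}^n$ as the shaping lattice, which is a deliberate simplification (the paper draws the coding lattice from a Loeliger ensemble); the function names and the toy values of `M_u`, `LT` and `a` are illustrative assumptions.

```python
from itertools import product
from math import floor, log2

def mod_lattice(y, a):
    """y mod (a * Z^n): subtract the nearest point of a*Z^n in each coordinate.
    The result lies in [-a/2, a/2) per coordinate."""
    return [v - a * floor(v / a + 0.5) for v in y]

def nested_codebook(n, a):
    """Coset leaders of the partition Z^n / (a * Z^n) (toy coding/shaping lattices)."""
    return sorted({tuple(mod_lattice(c, a)) for c in product(range(a), repeat=n)})

M_u, LT = 1, 2            # toy values: one antenna, two vector symbols
n = 2 * M_u * LT          # real dimension of the user's lattice
a = 2                     # nesting ratio, assumed to equal 2^(R / (2 M_u))

codebook = nested_codebook(n, a)
rate = log2(len(codebook)) / LT   # bits per channel use
```

With these toy values the codebook has $a^n = 16$ coset leaders, so the rate is 2 BPCU and the nesting ratio $2^{R_i/(2M_u)}$ evaluates to 2, matching `a`.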

B. Proposed Relay Mappers 

The relay mapper $\psi$ is used to select the codeword (coset leader) $\bar{\mathbf{c}}_{K+1}$ to be transmitted from the relay
(transmitter $K+1$) according to the codewords (coset leaders) $\bar{\mathbf{c}}_i$, $i = 1, \ldots, K$, of the $K$ users. In other words,
by concatenating the total $K+1$ codewords as a super one $\bar{\mathbf{c}} = [(\bar{\mathbf{c}}_1^T, \ldots, \bar{\mathbf{c}}_K^T), \bar{\mathbf{c}}_{K+1}^T]^T = [\bar{\mathbf{c}}_u^T, \bar{\mathbf{c}}_r^T]^T$ ((T1.5) in
Table I), we have $\psi(\bar{\mathbf{c}}_u) = \bar{\mathbf{c}}_r$. Now we introduce the proposed mappers. The first one is as follows.

Definition 2 (One-to-one mapper): The one-to-one mapper $\psi^{one} : \mathcal{C}_u \to \mathcal{C}_r$ for O-MLC is a
bijective mapping that maps coset leaders in the super-codebook of users $\mathcal{C}_u$ to the relay codebook

$^2$ Our results can be easily generalized to the case in which good (but maybe not self-similar) nested codes as in [16] are used.
TABLE I

List of Frequently Used Notation



Notation 



Definition 



Description 



(TLl) 



(TL2) 

(TL3) Vf{h) 
(TL4) V,- 



(TL5) v,„ V, 

(TL6) C^f 

(TL7) Ac„, As., 

(TL8) Ac,, A5^ 

(TL9) Ac,,,, As„, 

(Tl.lO) V, V/ 
(Tl.U) p/,Y/ 

(TI.12) y""'' 
(TL13) c;;"', cf"' 

(TL14) Yl"-', 



(TL15) {Cr 



(TL16) O'f'i 
(TL17) M'^ 



«-dimensional finite field over Xp = {0,1,..., p— 1}, Prime p finite field 
where p is a prime 

The set of v £ R" closer to than to any other A. e A, Voronoi Region 
for a lattice A 

Volume of Voronoi region 1^ in (T1.2) Fundamental Volume 

Hi X 1 vector v,- e A; consists of the elements of v in Vector for transmitter where 1 < / < 



A/, where v= [vj^, . . . , vJ^J-^ is [Y^f^y "ij x 1> and 
A; is transmitter I's lattice (coding or shaping) 



K correspond to the users while i = K + 
1 corresponds to the relay 

v« = [vj , v^]^ , Vr = v^+i, with V/ defined in (T1.4) Super- vector for all users, and vector 

for relay 

Cj"" X • ■ • X C^"^;, where Cf° is the Loeliger linear code Super Loeliger-linear-code of users and 

relay 

Super-coding and shaping lattices of 
users 

Super-coding and shaping lattices of 
relay 

X J^Sk+\ Super-coding and shaping lattices of 

users and relay 
Modulo lattice operation 
Loeliger lattice ensemble parameters 
One-to-one Mapper, Modulo-sum 
Mapper 

Users' Codebooks, Relay's Codebook 
Differential mapper for one-to-one and 
modulo-sum mapper 
Differential codewords in ensemble 

Differential ambiguity cosets 
Matrix for users in set S 



for transmitter / as in Definition [5] 
Ac, X ••• X Ac^, As, X ••■ X As^ 



Ac, X ■■■ xAc^^,, As, X ■■ 

V modAs„,, V/ mod As, 
Definition |5] 
Definition H [3] 



Definition |2] 

: Yr = d,(w), Yl"'' ■■ vr'' (^"(w)) = 

d,.(w) 

d 

Matrix M'^ = [M/, , ...,M,|^|] is formed from M = 
[Ml , ...,Mjfj^], where Km is the number of the subma- 
trices of M, 5 = {I'l , i'lsi }, 1 < n < ■ • ■ < i\s\ < Km 



(T1.18) R' 



(T1.19) Cd(HL«,) 
(T1.20) d(r) 



(T1.21) Ip 
(T1.22) -Zp_ 

(T1.23) 72 



{S,K+\} 
dst 



3 log \^2{\S\M„+M,.)LT + (^H 
h^Og\^m„LT+{Klayy^. 



H 



{S.K+\}, 
dst I 



^ I 
relay I 



The diversity gain lim '"p^p^*^^ given a certain mul- 



tiplexing gain r, where Pe{p) is the probability that 
not all users are correctly decoded, p is the received 
SNR,andr=[n,...,rd with r,- ^ ^lim |^ andiJ,(p) 

is the transmission rate of user ; 

Apply componentwise modulo p operation on z 



(^l.WJ' P = iPU-,PK+l) 



I'-l , ■■■!'-K+ii 

[Yiz[,...,yA:+izJ^i]^, for y 

1*1 ^K+\\ 



{yu-,yK+i), 



Rate constraint at the destination using 
unstructured Gaussian codebook 

Rate constraint at the relay using un- 
structured Gaussian codebook 

Diversity and multiplexing tradeoff 
(DMT) 



Modulo p 
Modulo vector p 

"Vector" Hadamard product 



Instead of $P_e(\rho)$, the outage probability is used for the calculation of the DMT of the relay node in the DDF [3], [8].
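$\mathcal{C}_r$. Here $\mathcal{C}_u \triangleq \{\bar{\mathbf{c}}_u : \bar{\mathbf{c}}_u = (\mathbf{c}_u \bmod \Lambda_{S_u}), \mathbf{c}_u \in \Lambda_{C_u}\}$ and $\mathcal{C}_r \triangleq \{\bar{\mathbf{c}}_r : \bar{\mathbf{c}}_r = (\mathbf{c}_r \bmod \Lambda_{S_r}), \mathbf{c}_r \in \Lambda_{C_r}\}$,
where $\Lambda_{S_u}$ and $\Lambda_{C_u}$ are defined in (T1.7) while $\Lambda_{S_r}$ and $\Lambda_{C_r}$ are defined in (T1.8) in Table I.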

Note that $|\mathcal{C}_u| = |\mathcal{C}_r|$ since the aforementioned mapping is bijective. The one-to-one relay mapper
may require high complexity as the size of the super-user codebook $|\mathcal{C}_u|$ becomes large. To reduce the
complexity of the mapper, we introduce another mapping $\psi^{sum}$, where the modulo-sum operation is
performed at the relay, which is motivated by the XOR operations in network coding [18].

Definition 3 (Modulo-sum mapper): The modulo-sum mapper $\psi^{sum} : \mathcal{C}_u \to \mathcal{C}_r$ for MS-MLC is
defined as $\psi^{sum}(\bar{\mathbf{c}}_u) = \sum_{i=1}^{K} \psi_i(\bar{\mathbf{c}}_i) \bmod \Lambda_{S_r}$, where $\psi_i : \mathcal{C}_i \to \mathcal{C}_r$ is an injective mapping for
user $i$ with nested user codebook $\mathcal{C}_i$ given in Definition 1, while $\mathcal{C}_u$ and $\mathcal{C}_r$ are given in Definition
2.

Note that we require $|\mathcal{C}_r| \ge \max_i\{|\mathcal{C}_i|\}$ to ensure that each per-user mapping $\psi_i$ in Definition 3 can be injective.
The domain size of the per-user mappings in $\psi^{sum}$ is at most $\max_i\{|\mathcal{C}_i|\}$ while that of the one-to-one mapper $\psi^{one}$ is
$\prod_{i=1}^{K} |\mathcal{C}_i|$, so $\psi^{sum}$ has lower complexity than $\psi^{one}$. However, the one-to-one mapper $\psi^{one}$
ensures that two different users' super-codewords are mapped to different codewords at the relay, which
results in better error performance. In contrast, it is possible that two different super-codewords map to
the same codeword of the relay due to the modulo-sum operation in $\psi^{sum}$, so that ambiguity occurs while
decoding.
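
The trade-off between the two mappers can be illustrated on a toy example. The sketch below replaces coset leaders by integer labels 0, ..., q-1 and models the reduction modulo the relay's shaping lattice by integer arithmetic modulo q, with identity per-user injections; these choices, and the names `psi_one` and `psi_sum`, are assumptions made only for illustration.

```python
from itertools import product

q = 3      # toy per-user codebook size (coset leaders labelled 0..q-1)
K = 2      # number of users
super_codewords = list(product(range(q), repeat=K))

def psi_one(c_u):
    """One-to-one mapper: every super-codeword gets its own relay label,
    so the relay codebook needs q**K entries."""
    label = 0
    for c in c_u:
        label = label * q + c
    return label

def psi_sum(c_u):
    """Modulo-sum mapper with identity per-user injections:
    the relay codebook needs only q entries."""
    return sum(c_u) % q

one_images = {psi_one(c) for c in super_codewords}
sum_images = {psi_sum(c) for c in super_codewords}

# Pairs of distinct super-codewords that the modulo-sum mapper cannot tell apart.
collisions = [(a, b) for a in super_codewords for b in super_codewords
              if a < b and psi_sum(a) == psi_sum(b)]
```

Here `psi_one` uses all 9 relay labels, while `psi_sum` uses 3 and maps, for example, (0, 1) and (1, 0) to the same relay codeword. This is the ambiguity described above, and it shrinks as the relay codebook is allowed to grow.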

C. Encoders and Proposed K-stage Coset Decoders 

1) Encoders at the K transmitters and the relay: User $i$ selects the codeword $\bar{\mathbf{c}}_i$ according to its message
$w_i$ from the codebook described in Section III-A and sends the signal $\mathbf{x}_i$ into the MARC (1)-(3) (cf. (7))

$$\mathbf{x}_i = ([\bar{\mathbf{c}}_i - \mathbf{u}_i] \bmod \Lambda_{S_i}), \qquad (10)$$

where $\mathbf{u}_i$ is a dither signal uniformly distributed over the Voronoi region $\mathcal{V}_{S_i}$ of the shaping lattice $\Lambda_{S_i}$
((T1.2) in Table I). From [19], due to the dither $\mathbf{u}_i$, $\mathbf{x}_i$ is uniformly distributed over $\mathcal{V}_{S_i}$ and independent
of $\bar{\mathbf{c}}_i$. To meet the input power constraints (4) as in [16], we let the second-order moment of the shaping
lattice be $\sigma^2(\Lambda_{S_i}) = 1/2$.

As for the relay (transmitter $K+1$), it first decodes the users' messages, using the operation
introduced below. Then the relay selects its codeword $\bar{\mathbf{c}}_{K+1}$ according to the decoded transmitted codewords
$\bar{\mathbf{c}}_i$ using the mappers in Section III-B, and then transmits $\mathbf{x}_{K+1}$ as in (10) under the power constraint (4).
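
The dithered encoding rule (10) can be sketched as follows. As a simplifying assumption, the shaping lattice is taken to be $a\mathbb{Z}^n$ and the coding lattice $\mathbb{Z}^n$, and the power normalization $\sigma^2(\Lambda_{S_i}) = 1/2$ is not applied; `encode` and `undo_dither` are hypothetical helper names.

```python
import math
import random

def mod_lattice(y, a):
    """y mod (a * Z^n), reduced to [-a/2, a/2) in each coordinate."""
    return [v - a * math.floor(v / a + 0.5) for v in y]

def encode(c_bar, u, a):
    """Dithered transmit signal x = (c_bar - u) mod Lambda_S, cf. (10)."""
    return mod_lattice([c - d for c, d in zip(c_bar, u)], a)

def undo_dither(x, u, a):
    """Add the dither back and reduce: recovers the coset leader in the noiseless case."""
    return mod_lattice([xi + ui for xi, ui in zip(x, u)], a)

random.seed(0)
a, n = 5, 6                                            # toy nesting ratio and dimension
c_bar = mod_lattice([random.randrange(a) for _ in range(n)], a)   # coset leader
u = [random.uniform(-a / 2, a / 2) for _ in range(n)]             # dither, uniform over the Voronoi cell
x = encode(c_bar, u, a)
recovered = undo_dither(x, u, a)
```

The transmitted vector `x` always stays inside the Voronoi cell of the shaping lattice regardless of the codeword, which is what makes the power constraint easy to meet, and a receiver that knows the dither recovers `c_bar` in the absence of noise.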
Fig. 2. The multiuser decoding tree for the $K$-stage coset decoding in Table II with $K = 3$. Here for each node, the label $(k,j)$ denotes the
$j$-th node from the left at the $k$-th stage (Node_stage in Table II), while the number $i$ inside a circle denotes the index $i$ of the user assumed
to have been correctly decoded at the previous stage (Node_user in Table II). For example, when the coset decoding in Table II is performed
at node (2, 1) (the leftmost child node of the root node), user 1 is assumed to have been correctly decoded. The path from root node (1, 1)
to node (3, 1) is illustrated with bolder lines.

2) K-stage coset decoder: We first introduce the decoder at the destination, which generalizes the single-stage
coset decoder in [10] to a multi-stage one. The coset decoder disregards the boundaries of the
codewords and avoids the complicated boundary control [12], which allows for significant complexity
reductions compared to ML decoding. Moreover, it facilitates the use of the efficient sphere decoding algorithms in
[11], [12]. To decode messages from the received signal $\mathbf{y}_{dst}$ in (6), the proposed $K$-stage coset decoder
works as in Table II, with the detailed steps explained as follows.
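
The idea of disregarding codeword boundaries can be sketched in a stripped-down setting: a single user, an identity channel, and no MMSE-GDFE filtering, so this is not the decoder of Table II but only the boundary-free search it builds on. The lattices $\mathbb{Z}^n$ and $a\mathbb{Z}^n$, the bounded noise, and the function names are assumptions for illustration.

```python
import math
import random

def mod_lattice(y, a):
    """y mod (a * Z^n), reduced to [-a/2, a/2) in each coordinate."""
    return [v - a * math.floor(v / a + 0.5) for v in y]

def coset_decode(y, u, a):
    """Search the whole coding lattice Z^n (no shaping boundary) for the point
    nearest to y + u, then reduce it to its coset leader modulo a * Z^n."""
    c_hat = [math.floor(yi + ui + 0.5) for yi, ui in zip(y, u)]
    return mod_lattice(c_hat, a)

random.seed(1)
a, n = 5, 8
c_bar = mod_lattice([random.randrange(a) for _ in range(n)], a)   # transmitted coset leader
u = [random.uniform(-a / 2, a / 2) for _ in range(n)]             # dither known to both ends
x = mod_lattice([c - d for c, d in zip(c_bar, u)], a)             # dithered transmit signal
noise = [random.uniform(-0.3, 0.3) for _ in range(n)]             # bounded noise keeps the demo deterministic
y = [xi + ni for xi, ni in zip(x, noise)]

decoded = coset_decode(y, u, a)
```

The nearest-point search may return a lattice point outside the shaping region; only its coset matters, which is why no boundary control is needed. With the noise bounded below half the lattice spacing, `decoded` equals `c_bar`.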

According to Table II, the decoder first generates the decoding tree as in Step A. An example for $K = 3$
is given in Fig. 2. The decoder traverses nodes from stage 1 to $K$ in the tree and produces the candidate
codewords. We take the root node in Fig. 2 as an example to explain Steps B.1 and B.2 in Table II. We
use the notation $\bar{\mathbf{c}}(\mathbf{w}_t)$ to represent the super-codeword for the $K+1$ transmitters corresponding to the
TABLE II

Algorithm of the $K$-stage coset decoding for the destination*

A. Generation of the decoding tree:
  Initialization: For the root node, Node_user = empty, Node_stage = 1
  for k = 1 : K-1
    for each node with Node_stage = k,
      generate (K - k + 1) child nodes for the next stage (Node_stage = k + 1),
      for the child nodes from left to right,
        Node_user are assigned from the set {1, ..., K} \ S in increasing order,
        where S = {j : Node_user = j, for the ancestors** of the child node}
  end

B. K-stage candidate generation via coset decoding:
  for k = 1 : K
    Step B.1: For the node $(k,j)$, let $\mathbf{y}_{dst}^{(k,j)} = \mathbf{y}_{dst} - \sum_{i \in S_p^{(k,j)}} \mathbf{H}_i^d \mathbf{x}_i$, where $S_p^{(k,j)}$ is the set of previously decoded users $i$ along the
    path from the root node to node $(k,j)$, $\mathbf{x}_i$ is the transmitted signal (from (10)) corresponding to previously decoded user $i$'s message,
    and the channel $\mathbf{H}_i^d$ is formed from (8).
    (For example, for the path starting from the root node to node (3,1) in Fig. 2, the set $S_p^{(3,1)}$ is {1,2}.)
    Step B.2: Decode the users' messages in the residual user set $S^{(k,j)} = \{1, \ldots, K\} \setminus S_p^{(k,j)}$ by coset decoding
      $\hat{\mathbf{c}}^{(k,j)} = \arg\min_{\mathbf{c} \in \mathcal{O}^{(k,j)}} M^{(k,j)}(\mathbf{c}^{(k,j)})$,
    where
      $M^{(k,j)}(\mathbf{c}^{(k,j)}) = |\mathbf{F}^{(k,j)} \mathbf{y}_{dst}^{(k,j)} + \mathbf{B}^{(k,j)} (\mathbf{u}^{(k,j)} - \mathbf{c}^{(k,j)})|^2$.   (T2.1)
    Let $m_k = 2((K-k+1)M_u + M_r)LT$ and $N' = 2NLT$. The $m_k \times 1$ vector $\mathbf{c}^{(k,j)}$ is formed from the super-codeword $\mathbf{c}$ by collecting all
    $\mathbf{c}_i \in \Lambda_{C_i}$ of transmitter $i$ ((T1.4) in Table I) where $i \in \{S^{(k,j)}, K+1\}$; the dither signal $\mathbf{u}^{(k,j)}$ is formed from $\mathbf{u}$ similarly; the
    $m_k \times N'$ matrix $\mathbf{F}^{(k,j)}$ and the $m_k \times m_k$ matrix $\mathbf{B}^{(k,j)}$ are the corresponding MMSE-GDFE filters for $\mathbf{c}^{(k,j)}$; and the searching cosets formed by
    previously decoded users $i \in S_p^{(k,j)}$ are
      $\mathcal{O}^{(k,j)} = \{\mathbf{c} : \mathbf{c} \in \mathcal{O}, (\mathbf{c}_i \bmod \Lambda_{S_i}) = (\hat{\mathbf{c}}_i \bmod \Lambda_{S_i}), i \in S_p^{(k,j)}\}$,
    where $\mathcal{O}$ is given in (12), $\mathbf{c}_i \in \Lambda_{C_i}$ ((T1.4) in Table I) where $\Lambda_{C_i}$ is transmitter $i$'s coding lattice, and $(\hat{\mathbf{c}}_i \bmod \Lambda_{S_i})$ is
    the codeword (coset leader) of the previously decoded user $i$, where $i \in S_p^{(k,j)}$.
  end

C. Candidate elimination:
  For node $(K,j)$ at the final stage $K$, combine the decoded messages to produce the $K \times 1$ super-message $\hat{\mathbf{w}}^{(K,j)}$ as
  the candidate at node $(K,j)$. The decoder searches over all $K!$ candidates $\hat{\mathbf{w}}^{(K,j)}$ and declares the one such that $\mathbf{H}_{dst}\mathbf{x}^{(K,j)}$ is
  nearest to the received signal $\mathbf{y}_{dst}$ as the final decoded message, where $\mathbf{x}^{(K,j)}$ is the transmitted signal according to message $\hat{\mathbf{w}}^{(K,j)}$.

* The algorithm for the relay can be identically obtained by ignoring the relay's codewords.
** The ancestors of a node are all the nodes along the path from the root to that node (not included).
$K \times 1$ transmitted message vector $\mathbf{w}_t = [(\mathbf{w}_t)_1, \ldots, (\mathbf{w}_t)_K]^T$, where $(\mathbf{w}_t)_i$ denotes the transmitted message
for user $i$. For the root node (the first-stage coset decoding), with received signal $\mathbf{y}_{dst}$ at the destination,
the decoder output $\hat{\mathbf{c}}$ according to (10) is

$$\hat{\mathbf{c}} = \arg\min_{\mathbf{c} \in \mathcal{O}} M(\mathbf{c}), \quad \text{with } M(\mathbf{c}) = |\mathbf{F}_{dst}\,\mathbf{y}_{dst} + \mathbf{B}_{dst}(\mathbf{u} - \mathbf{c})|^2, \qquad (11)$$

where $\mathbf{F}_{dst}$ and $\mathbf{B}_{dst}$ are the forward and feedback filters of the minimum mean-square error (MMSE)
estimation generalized decision feedback equalizer (GDFE) as defined in [10] and [16], respectively;
$\mathbf{u} = [\mathbf{u}_1^T, \ldots, \mathbf{u}_{K+1}^T]^T$; and the decoder searches points in the cosets (see (12)) of all $\bar{\mathbf{c}}(\mathbf{w})$ (defined
similarly to $\bar{\mathbf{c}}(\mathbf{w}_t)$ above), $\mathbf{w} \in \mathcal{W}$, with $\mathcal{W}$ being the set of all possible messages:

$$\mathcal{O} \triangleq \{\mathbf{c} \in \Lambda_{C_{ur}} : (\mathbf{c} \bmod \Lambda_{S_{ur}}) = \bar{\mathbf{c}}(\mathbf{w}), \mathbf{w} \in \mathcal{W}\}, \qquad (12)$$

where the super-lattice of users and the relay $\Lambda_{C_{ur}}$ is defined in (T1.9) of Table I. The decoded message
$\hat{\mathbf{w}}$ is declared if $\bar{\mathbf{c}}(\hat{\mathbf{w}})$ and the decoded $\hat{\mathbf{c}}$ from (11) belong to the same coset, i.e., $\hat{\mathbf{c}} \bmod \Lambda_{S_{ur}} = \bar{\mathbf{c}}(\hat{\mathbf{w}})$. For
the node $(k,j)$ in the decoding tree (the $j$-th node from the left at the $k$-th stage), we consider a path from
the root node to node $(k,j)$. An example for $(k,j) = (3,1)$ is given in Fig. 2. In Step B.1 of Table II, the
decoder assumes that all the users at the nodes along the path (users 1 and 2 for the example path in Fig.
2) have already been successively decoded (not necessarily correctly), and subtracts the corresponding
transmitted signals from the received signal $\mathbf{y}_{dst}$. Then the decoder decodes the remaining transmitted
messages by (T2.1) in Step B.2 of Table II (which corresponds to (11)). Finally, as in Step C, the
decoder searches over all $K!$ candidates produced at the nodes at the $K$-th stage (instead of all $2^{LT\sum_{i=1}^{K} R_i}$
codewords) to choose the final decoded message.
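
The tree traversal described above can be sketched by enumerating, for every node, the ordered list of users assumed decoded on the way to that node. The representation of a node by its path tuple and the name `build_tree` are illustrative assumptions; the coset decoding performed at each node is not modelled here.

```python
def build_tree(K):
    """Return a list of stages; stages[k-1] holds, for each node at stage k
    (left to right), the tuple of Node_user labels on the path from the root."""
    stages = [[()]]                        # stage 1: the root, nobody decoded yet
    for _ in range(1, K):
        next_stage = []
        for path in stages[-1]:
            # Children take the not-yet-used users in increasing order.
            for user in range(1, K + 1):
                if user not in path:
                    next_stage.append(path + (user,))
        stages.append(next_stage)
    return stages

tree = build_tree(3)
stage_sizes = [len(stage) for stage in tree]
path_to_node_3_1 = tree[2][0]              # node (3, 1): leftmost node at stage 3
```

For $K = 3$ the stage sizes are 1, 3 and 6, so the final stage yields $K! = 6$ candidates, and the path to node (3, 1) is (1, 2), matching the example set {1, 2} in Step B.1.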

The decoder at the relay also uses (11) as the criterion to decode messages from $\mathbf{y}_{relay}$ in (5) with the
corresponding MMSE-GDFE forward and feedback filters. The main difference is that now the decoding
does not make use of the relay codebook, and the decoder searches in the super-lattice of users $\Lambda_{C_u}$ instead
of the coset in (11). The complexity of the decoder in Table II is about
$\left(\sum_{k=1}^{K} \frac{K!}{(K-k+1)!}\, O(m_k^3) + K!\, O(m_1^2)\right) = O((LT)^3)$,$^3$ where $m_k = 2((K-k+1)M_u + M_r)LT$. It is much smaller than the
complexity of the ML decoder, $O(2^{LT\sum_{i=1}^{K} R_i})$, which grows exponentially with the block length $LT$.
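
A rough feel for the gap can be obtained by counting operations rather than timing them. The sketch below counts the coset-decoding runs in the tree (one per node, with $K!/(K-k+1)!$ nodes at stage $k$) and the number of super-codewords an exhaustive ML search would have to score; the function names and the toy parameters are assumptions.

```python
from math import factorial

def coset_decoder_runs(K):
    """Number of tree nodes, i.e. coset-decoding runs: sum over k of K!/(K-k+1)!."""
    return sum(factorial(K) // factorial(K - k + 1) for k in range(1, K + 1))

def ml_codewords(rates, LT):
    """Number of users' super-codewords scored by an exhaustive ML search."""
    return 2 ** (LT * sum(rates))

runs = coset_decoder_runs(3)               # 1 + 3 + 6 nodes for K = 3
ml_count = ml_codewords([2, 2, 2], LT=8)   # three users at 2 BPCU, LT = 8
```

The number of runs depends only on $K$, while the ML count doubles with every extra bit per block, which is the exponential growth in $LT$ noted above.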

$^3$ According to our practical design in Section V, one can use a linear mapper to implement O-MLC and MS-MLC. Then the $m_k$-dimensional
coset decoder can be implemented by the sphere decoding algorithm in [12], with complexity roughly $O(m_k^3)$.
Note that since the super-codewords have to satisfy the relay mapping rule (which may not be linear)
in Section III-B, the set $\mathcal{O}$ is not necessarily a sublattice of $\Lambda_{C_{ur}}$. This makes (11) different from the
MMSE-GDFE lattice decoder in [10] and [16]. Without the algebraic structure of a lattice, the
error probability analysis in the next section and the design of practical decoding algorithms for the
simulations in Section V are much more difficult than those in [10].

IV. Performance analysis of the proposed coding schemes
In this section, we establish the achievable rate regions for the MARC defined in (5) and (6) using
the proposed O-MLC and MS-MLC for a fixed channel matrix, respectively. We show that the rate
performance, which was originally achieved by using an unstructured random codebook in [7], is now
achieved by our structured O-MLC. The key is using the $K$-stage coset decoder, which performs successive
cancellation on the multiuser decoding tree, thus avoiding the rate loss incurred by the one-stage coset
decoder in [10]. The rate loss due to the use of a one-stage coset decoder is derived in Corollary 1. However,
in Corollary 2 we show that the rate loss is relatively small in the high SNR regime, and structured
O-MLC with the one-stage coset decoder achieves the optimal DMT for the MARC in (5) and (6). Note
that this DMT was achieved by an unstructured random codebook and ML decoding in [8]. For MS-MLC,
we show that it can approach the rate performance of O-MLC by increasing the relay's codebook size,
and thus can trade off between the rate performance and complexity.

In the error analysis of the proposed schemes, the conventional approach tailored for ML decoding [5],
[8] and [20] fails in predicting the performance of the coset decoder in (11) due to the infinite number
of points $\mathbf{c} \in \mathcal{O}$, where the set $\mathcal{O}$ is defined in (12). To solve this problem, from (12), we define the
differential ambiguity cosets for the event that the transmitted message is erroneously decoded as $\mathbf{w}$ as

$$\mathcal{O}_d = \{\mathbf{d} \in \Lambda_{C_{ur}} : \bar{\mathbf{d}} = \bar{\mathbf{d}}(\mathbf{w}), \mathbf{w} \in \mathcal{W}, \mathbf{w} \ne \mathbf{w}_t\}, \qquad (13)$$

where the differential codeword is $\bar{\mathbf{d}}(\mathbf{w}) = (\bar{\mathbf{c}}(\mathbf{w}) - \bar{\mathbf{c}}(\mathbf{w}_t)) \bmod \Lambda_{S_{ur}}$, with $\Lambda_{S_{ur}}$ given in (T1.9) of Table I, and
the vector after the modulo operation, $\bar{\mathbf{d}}$, is defined in (T1.10). From the closure property of lattice addition,
$\bar{\mathbf{d}}(\mathbf{w}) \in \Lambda_{C_{ur}}$. Moreover, $\mathcal{O}_d$ is not a direct product of $K+1$ lattices (i.e., $\Lambda_{C_{ur}}$), and thus the techniques in
[10] fail to predict the error probability of O-MLC. We propose a new error probability upper bound which
avoids directly counting points of $\mathcal{O}_d$ in the decision region of the decoder, as this kind of evaluation is
intractable. Please see the upcoming Lemma 1 presented in the proof of Theorem 1 and the discussions
after it.

Besides providing the aforementioned new proof techniques, we also show that there is a rate loss due to the one-stage coset decoding in [10]. The loss can be circumvented with the proposed K-stage coset decoder by letting the decoder successively cancel the previously decoded messages. We show that in our multiuser decoding tree in Fig. 2, there exists at least one path at each stage of Step B of Table II on which the previously decoded messages are correct. Then we can at least obtain a better decoder for the remaining users in the next stage to improve the error performance. To show that we can always choose the correct codeword from the candidates at the final stage in the decoding tree, we use a suboptimal decoder instead of the optimal one in Step C of Table II to complete our proof. Note that our decoder is different from the successive MAC decoding studied in [21], where the decoder is based on ML decoding and the previously decoded messages are assumed to be correct.

Now, we are ready to derive the achievable rate region of the MARC in (5) and (6) using O-MLC as follows.

Theorem 1: For the MARC in (5) and (6), the DDF rate region in (14) and (15), which is achieved by unstructured Gaussian codebooks and ML decoding in [7], is achievable by the structured O-MLC and the K-stage coset decoder in Table II, where the rate constraints at the relay and destination are

I^^<7V<S^(HLj,and (14) 

<^RinGi^lif^% ysc{i,...,K} (15) 

respectively, with Rf^ci^fsf^^^) ^^d (H^ay) gi^^^^ '^^ (T1.18) and (T1.19) in TableHJ The channel 
matrix from the users in the set S and the relay to the destination H^^^ ' is formed from H^^ = 
[Hf , . . . ,H^_|_j] as in (T1.17) with Hf given in ^ and and the channel matrix from the users in the 
set S to the relay H^g/^,^, is defined similarly to H^^^''^^^^ 

Proof: We prove only (15) here since (14) follows similarly. First, for the first stage ($k = 1$, the root node of Fig. 2) of the candidate generation process in Step B of Table II, we show that at least one of the users' messages is correctly decoded in the generated "super"-message $\mathbf{w}^{(1,1)}$ of all users (with probability 1) as $T \to \infty$. To do this, we first define the following error event.
Definition 4 (set-S error): A decoded super-message $\mathbf{w}$ has a set-$S$ error if, for every user $i \in S$, the message in $\mathbf{w}$ is different from the corresponding transmitted message. That is, $\mathbf{w}_i \neq (\mathbf{w}_t)_i, \forall i \in S$, while $\mathbf{w}_i = (\mathbf{w}_t)_i$ otherwise.
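The set-S error event can be checked mechanically. The following is a minimal sketch, assuming super-messages are represented as tuples with one entry per user and users are indexed from 1; the function names are hypothetical and are not taken from the paper.

```python
def error_set(decoded, transmitted):
    """Return the set of (1-based) user indices whose messages differ."""
    if len(decoded) != len(transmitted):
        raise ValueError("super-messages must have one entry per user")
    return {i + 1 for i, (w, wt) in enumerate(zip(decoded, transmitted)) if w != wt}


def has_set_s_error(decoded, transmitted, user_set):
    """True iff `decoded` is wrong for every user in user_set and right elsewhere."""
    return error_set(decoded, transmitted) == set(user_set)


# Illustrative values only: three users, users 2 and 3 decoded incorrectly.
w_t = (3, 7, 1)
w_hat = (3, 2, 5)
```

Note that the definition is exact: a super-message with errors on users 2 and 3 has a set-{2, 3} error but not a set-{2} error.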

Let $P_e(S|\mathbf{H}_{dst})$ be the probability that, for fixed $\mathbf{H}_{dst}$, there exists $\mathbf{w}$ with set-$S$ error and $\min_{\mathbf{c} \in o(\mathbf{w})} M(\mathbf{c}) \le \min_{\mathbf{c} \in o(\mathbf{w}_t)} M(\mathbf{c})$, with $M(\mathbf{c})$ defined in (11) and $o(\mathbf{w})$ being the coset of $\mathbf{w}$. To validate our claim, we first consider the erroneous user set $S^{(1)} = \{1, \ldots, K\}$ and will prove that $P_e(S^{(1)}|\mathbf{H}_{dst}) \to 0$ for the first stage, if the transmission rates $R_i$ satisfy (15) and the lattice codes are good as defined in the upcoming Lemma 1. Here $P_e(S^{(1)}|\mathbf{H}_{dst})$ is averaged over the random relay-mapper and linear-code ensemble $\mathcal{E}_{\psi^{o},C} = \{\psi^{o}, C_i\}$ consisting of all possible one-to-one mappers $\psi^{o}$ and Loeliger linear codes $C_i$ of the users and relay ((T1.6) in Table I).

Lemma 1: For O-MLC, let $R_i$, $i = 1, \ldots, K+1$, be the code rates for the users and the relay, and let $\{\Lambda_{c_i}\}$ belong to the Loeliger lattices ensembles (cf. Definition 5 in Appendix A-(I)). For stage $k = 1$ of Step B in Table II, as $LT \to \infty$, the set-$S^{(1)}$ error probability (cf. Definition 4), where $S^{(1)} = \{1, \ldots, K\}$, satisfies



I "^.c^o 



< exp 



-LT 
loge 



(16) 



where $\mathcal{O}_{S^{(1)}}$ consists of the points belonging to the differential ambiguity cosets for O-MLC $\mathcal{O}_{\Delta}^{\psi^{o}}$ (cf. (13)) with corresponding messages having set-$S^{(1)}$ errors; $\mathcal{E}_{\psi^{o},C} = \{\psi^{o}, C_i\}$ is defined right before Lemma 1; the decision region is $\mathcal{R} = \{\mathbf{v} : |\mathbf{B}_{dst}\mathbf{v}|^2 \le (KM_u + M_r)LT(1 + \beta)\}$, with filter $\mathbf{B}_{dst}$ defined as in (11) and $\beta > 0$; and the rate constraint is defined as in (15).

The proof of Lemma 1 is given in Appendix A. The main difficulty is that the cosets $\mathcal{O}_{\Delta}^{\psi^{o}}$ are not a direct product of $K+1$ lattices as in [10], so the methods in [10] and [11] cannot be directly applied to counting the number of points of $\mathcal{O}_{S^{(1)}}$ in the decision region $\mathcal{R}$ in the second inequality of (16). We avoid explicitly counting points in $\mathcal{O}_{S^{(1)}}$ by developing new upper bounds as in (26) and (27) in Appendix A. Otherwise, naively applying the methods of [10] and [11] will result in rates as in (16) but without the factor $(2^{R_{K+1}LT} - 1)$ cancelling out $2^{R_{K+1}LT}$, and lead to a significant rate loss compared with our (15) with $S = S^{(1)}$, since $R_{K+1} = \sum_{i=1}^{K} R_i$ is required to ensure the one-to-one mapping.

With the results for the first stage $k = 1$ in Lemma 1, we show by induction that after the candidate generation process in Step B of Table II, among all "super"-messages $\mathbf{w}^{(K,i)}$ at stage $K$ (defined in Step C), there exists a correct one almost surely (with probability 1) as $T \to \infty$. To do this, we will show that for stage $k$, with at least $k-1$ (almost surely) correctly decoded users from the previous stage, almost surely there exists one node $(k, j_k)$ having at least $k$ users correctly decoded. Note that for stage $k$, conditioned on the event that all decoded users' messages from the previous stages are correct, the noise $\mathbf{n}_{dst}$ in (6) may no longer be Gaussian [21]. However, under the condition (15), the probability $P_e^{(k)}$ that there exists no node at stage $k$ having at least $k$ users correctly decoded can be shown to still satisfy

$$P_e^{(k)} \overset{(a)}{\le} P_e(S^{(1)}|\mathbf{H}_{dst}) + \sum_{s=2}^{k} P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)}) \overset{(b)}{\to} 0, \quad (17)$$

as $LT \to \infty$, where $P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)})$ is defined under Gaussian $\mathbf{n}_{dst}$ and will be given below, and (17a) follows from [21]. Then our claim for stage $K$ is valid and $P_e^{(K)} \to 0$ by induction. Since under (15), as $LT \to \infty$, $P_e(S^{(1)}|\mathbf{H}_{dst}) \to 0$ from Lemma 1, we will show that $P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)}) \to 0$, $2 \le s \le k$, in this setting to validate (17b). Now we introduce the definition of $P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)})$ as follows. Let $S_p^{(s,j_s)}$ be the set of $s-1$ previous users along the path starting from the root node to node $(s, j_s)$, the $j_s$-th node at stage $s$, in the decoding tree shown in Fig. 2. Also, let the set $S^{(s,j_s)}$ be $\{1, \ldots, K\} \setminus S_p^{(s,j_s)}$. Then $P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)})$ is defined as the probability that there exists $\mathbf{w}$ with set-$S^{(s,j_s)}$ error (Definition 4), conditioned on the event that all users in $S_p^{(s,j_s)}$ are correct (the existence of $j_s$ is guaranteed by the induction assumption $P_e^{(s)} \to 0$, $1 \le s < k$) and $\mathbf{n}_{dst}$ in (6) is conditionally Gaussian. For this kind of error event, $\min_{\mathbf{c} \in o(\mathbf{w})} M^{(s,j_s)}(\mathbf{c}) \le \min_{\mathbf{c} \in o(\mathbf{w}_t)} M^{(s,j_s)}(\mathbf{c})$, with $M^{(s,j_s)}(\mathbf{c})$ defined on the right-hand side (RHS) of (T2.1) in Table II. As in the proof of Lemma 1 in Appendix A, we can similarly upper-bound $P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)})$ by the RHS of (16) with $S^{(1)}$ replaced by $S^{(s,j_s)}$. Thus, if the transmission rates $R_i$ satisfy (15), as $T \to \infty$ we have $P_e^{G}(S^{(s,j_s)}|\mathbf{H}_{dst}, S_p^{(s,j_s)}) \to 0$, which verifies (17b). This validates our claim for stage $K$.

For Step C of Table II, we will use the following suboptimal decoder instead of the optimal decoder in Table II to prove that we can find the correct message almost surely. First, we compare candidates $\mathbf{w}^{(K,1)}$ and $\mathbf{w}^{(K,2)}$, and form the set of users $S_c$ so that for any $i \in S_c$, $\mathbf{w}^{(K,1)}$ and $\mathbf{w}^{(K,2)}$ have a common message for user $i$. Then we compare the "coset"-distances $\min_{\mathbf{c} \in o(\mathbf{w}^{(K,1)})} D^{(S_c)}(\mathbf{c})$ and $\min_{\mathbf{c} \in o(\mathbf{w}^{(K,2)})} D^{(S_c)}(\mathbf{c})$ of these two candidates and choose the one with the smaller "coset"-distance (if equal, we randomly select one), where $D^{(S_c)}(\mathbf{c})$ is formed by replacing $S_p^{(s,j_s)}$ with $S_c$ in $M^{(s,j_s)}(\mathbf{c})$ in (T2.1) of Table II (along with the corresponding parameters). We then compare the chosen candidate with the next candidate $\mathbf{w}^{(K,3)}$, and so on. After $K! - 1$ comparisons among the total of $K!$ candidates, the candidate chosen in the final comparison is declared as the decoded message.

Now we show that the error probability of the above suboptimal decoder approaches zero. As in (17a), this error probability is upper-bounded by $P_e^{(K)} + P_e^{G}(\mathbf{w}^{(K,i)} | \mathbf{w}^{(K,i^*)} = \mathbf{w}_t)$, where $P_e^{(K)}$ is defined before (17) and $P_e^{G}(\mathbf{w}^{(K,i)} | \mathbf{w}^{(K,i^*)} = \mathbf{w}_t)$ is the probability that the suboptimal decoder outputs an incorrect $\mathbf{w}^{(K,i)}$, conditioned on the event that there is one correct candidate $\mathbf{w}^{(K,i^*)} = \mathbf{w}_t$ and the noise $\mathbf{n}_{dst}$ is Gaussian. Since $P_e^{(K)} \to 0$ according to the previous paragraph, we will show that $P_e^{G}(\mathbf{w}^{(K,i)} | \mathbf{w}^{(K,i^*)} = \mathbf{w}_t) \to 0$, and then our proof is complete. Specifically, if the decoder output is $\mathbf{w}^{(K,i)} \neq \mathbf{w}_t$, it must have a smaller (or equal) "coset"-distance than that of $\mathbf{w}^{(K,i^*)} = \mathbf{w}_t$, i.e., $\min_{\mathbf{c} \in o(\mathbf{w}^{(K,i)})} D^{(S_c)}(\mathbf{c}) \le \min_{\mathbf{c} \in o(\mathbf{w}^{(K,i^*)})} D^{(S_c)}(\mathbf{c})$, and now $S_c$ becomes the set of correctly decoded users in $\mathbf{w}^{(K,i)}$, since it is the set of users with common messages for both $\mathbf{w}^{(K,i)}$ and the correct $\mathbf{w}^{(K,i^*)} = \mathbf{w}_t$. That is, message $\mathbf{w}^{(K,i)}$ may have a set-$(S_c)^c$ error (Definition 4) given that the users in $S_c$ are correct, where $(S_c)^c = \{1, \ldots, K\} \setminus S_c$. However, from the derivations in the previous paragraph, conditioned on the event that the users in $S_c$ are correct, the probability of a set-$(S_c)^c$ error satisfies $P_e^{G}((S_c)^c | \mathbf{H}_{dst}, S_c) \to 0$. This is a contradiction, and hence $P_e^{G}(\mathbf{w}^{(K,i)} | \mathbf{w}^{(K,i^*)} = \mathbf{w}_t) \to 0$. Thus our suboptimal decoder will almost surely find the correct $\mathbf{w}_t$, and this concludes our proof since the optimal decoder in Table II will perform even better. ■
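The pairwise elimination used by the suboptimal decoder can be sketched as follows. This is only an illustration of the comparison schedule ($K! - 1$ comparisons over $K!$ candidates): the callable `coset_distance` is a hypothetical stand-in for the "coset"-distance of the text, the toy metric below is not the lattice metric, and ties keep the current survivor rather than being broken at random.

```python
from math import factorial


def common_users(cand_a, cand_b):
    """Indices of users on which two candidate super-messages agree (the set S_c)."""
    return frozenset(i for i, (a, b) in enumerate(zip(cand_a, cand_b)) if a == b)


def knockout_decode(candidates, coset_distance):
    """Compare the current survivor with each next candidate in turn.

    coset_distance(candidate, s_c) is a hypothetical callable standing in for
    the "coset"-distance evaluated with S_c as the set of common users.
    Ties keep the current survivor here; the text breaks ties at random.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    survivor, comparisons = candidates[0], 0
    for challenger in candidates[1:]:
        s_c = common_users(survivor, challenger)
        comparisons += 1
        if coset_distance(challenger, s_c) < coset_distance(survivor, s_c):
            survivor = challenger
    return survivor, comparisons


# Toy stand-in metric (NOT the lattice metric): number of users outside S_c
# whose message differs from a known transmitted super-message.
W_T = (1, 2, 3)


def toy_distance(candidate, s_c):
    return sum(1 for i, (w, wt) in enumerate(zip(candidate, W_T))
               if i not in s_c and w != wt)


K = 3
CANDIDATES = [(1, 0, 3), (0, 2, 0), (1, 2, 3), (4, 2, 3), (0, 0, 0), (1, 2, 0)]
decoded, n_comparisons = knockout_decode(CANDIDATES, toy_distance)
```

With this toy metric the correct candidate always has distance zero against any rival, which mirrors the argument of the proof: once the correct candidate enters a comparison, it survives all later ones.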
If only the one-stage coset decoder is used as in [10], we have the following.

Corollary 1: For the MARC in (5) and (6), the rate region constrained by (18) and (19), which is strictly smaller than that in Theorem 1, is achievable by O-MLC with the one-stage coset decoder in (11), where




I^/ <^^::G(^^r'^) - +^.)iog|^;^, V5 c {i, (i9) 

The proof can be easily obtained by modifying Lemma 1, in which we now count all of the points in the cosets $\mathcal{O}_{\Delta}^{\psi^{o}}$ instead of only counting those corresponding to messages with set-$S^{(1)}$ error (Definition 4) as in $\mathcal{O}_{S^{(1)}}$ of (16), and by following arguments similar to those used in Theorem 1. The details are omitted

here. Clearly, compared to the rate region in (IT4l i and ([TST i. there are rate loss terms M„|5|logj|| and 
(M„|5| +M^)log|^|±§- in d and ([H, respectively. These losses are zero when |5| = K, and the 
MMSE-GDFE processing for the one-stage coset decoding in (fTTI) is only sum rate optimal. 
For MS-MLC, we have the following theorem. In this result, in addition to the same rate constraints (14) and (15) as in Theorem 1, there is an additional rate constraint (20) for MS-MLC, which makes the achievable rate region smaller than that for O-MLC.

Theorem 2: For the MARC in (5) and (6), using MS-MLC and the K-stage coset decoder in Table II, the rate region with the constraints in (14) and (15) and the following additional constraint (20) is achievable, where

I^Rr < ^R'MKst) -M,|5|log '^'^; +^'- +i?^H-i V5 C {1, ...,K}, \S\ > 1. (20) 

When using MS-MLC with the one-stage coset decoder in (11), the rate region with the constraints in (18) and (19) and the following additional constraint (21) is achievable, where

Proof: Unlike O-MLC, for MS-MLC it is possible that two different user super-codewords are mapped to the same relay codeword, by Definition 3. As a result, the properties exploited in Lemma 1 for the random mapped-codebook ensemble of O-MLC (for details, please see the proof of (28) in Appendix A) no longer hold for the ensemble for MS-MLC. Thus Lemma 1 cannot be applied to MS-MLC. We solve this problem by dividing the random mapped-codebook ensemble for MS-MLC into two partitions, and the techniques for proving Lemma 1 can be modified to deal with each partition separately. The detailed proof is given in Appendix B. The rate region for the one-stage coset decoder in (18), (19) and (21) follows by using techniques similar to those used in the proof of Corollary 1. ■
The additional rate constraint (1201 ) is due to the ambiguity of the modulo-sum mapper in MS-MLC, 
where there is a rate loss term Mi,\S\log^-^^j^0^ . However, the rate constraint (|20] ) can be negligible and 
even looser than constraint (fTST ), as (^a:+i — M„ | 5 | log ^^^^"i^'^' ) becomes larger by increasing the relay 
codebook size $2^{R_{K+1}LT}$ (which reduces the occurrence of ambiguity). Thus MS-MLC can approach the performance of O-MLC by increasing the complexity.

Finally, for random slow fading channels, we show that O-MLC with the one-stage coset decoder (11) is DMT optimal for the DDF MARC, as stated in the following corollary. Despite the rate loss terms in (18) and (19) compared with (14) and (15), respectively, the losses become relatively negligible for the DMT analysis when the SNR is high.




Corollary 2: For the MARC in ([5]) and (|6l), with the one-stage coset decoder (fTTI) . the 0-MLC achieves 
the optimal DDF DMT d{r) of © and respectively, where d{r) is defined in (T1.20) of Table HI 

Sketch of proof: As in [3] and [5], we need to establish the DMT optimality for both the relay and destination channels. We focus on the destination channel (6), since the DMT optimality for the relay channel (5) (identical to the MAC channel) has been proved in [10]. Following [16] and the proof steps for (19), we can exponentially upper-bound the error probability $P_e(\rho_d)$ in (T1.20) of Table I using decoder (11) (averaged over random $\mathbf{H}_{dst}$ which satisfy (15)) as

.dst (jj{S,K+\Y 



Pe{pd)<EH, 



dst 



(1+5). pf^--'exp T±R'j:o{^fJ 

sc{i,...,K},s^<^ 



Pr{0) (22) 



loge 

where $\delta > 0$; $\rho_d$ is the received SNR at the destination; $r_i$ is the given multiplexing gain for user $i$ as in (T1.20); exponential inequality and exponential equality [20] are denoted by $\dot{\ge}$ and $\doteq$; and $\mathcal{O}$ is the outage event in which $\mathbf{H}_{dst}$ does not satisfy (15). The proof of (22) is detailed in Appendix D. Together with the fact that, for any coding scheme, $P_e(\rho_d) \,\dot{\ge}\, \Pr(\mathcal{O}) \doteq \rho_d^{-d(r)}$ as in [20], we prove that O-MLC can achieve the optimal DMT $d(r)$ for the destination. ■
In jHl, a two-user, single antenna node MARC was studied for the symmetric rate case (Ri = R2), 
which showed that the DDF strategy achieves the optimal DMT for the MARC in the low to medium 
multiplexing gain regime. The DMT results of Corollary 2 can be achieved by codebooks that are more structured than those in [5]. Moreover, our designs in the next section also demonstrate that our theoretical results can be implemented in practice.

V. Simulation Results
In this section, we present numerical examples to illustrate our theoretical results. Performance results based on practical decoders are also presented. As mentioned in Section III-C, the lattice decoder in [10] and [16] is not directly applicable to our coset decoder of (11), since only the points in $\mathcal{O}^{\psi}$ of (12) will be searched. In general, the optimal non-linear relay mapper may make the coset decoders very complicated and impractical. To simplify the coset decoder for the relay mapper, we resort to a sub-optimal linear mapper such that the coset decoder of (11) can be transformed into an efficient lattice decoder. For simplicity, we consider the case in which there are two users with the same transmission rate, i.e., $R_1 = R_2 = R$.
Let the code rate of the relay be $R_3 = 2R$, and let $\mathbf{G}_i$, $i = 1, 2, 3$, be the generator matrix of the coding lattice (cf. Definition 5) for transmitter $i$. Then for user $i = 1, 2$, the codewords are $\mathbf{c}_i = (\mathbf{G}_i\mathbf{z}_i \bmod \Lambda_{S_i})$, where $\mathbf{z}_i \in \mathbb{Z}^{2M_uLT}$. For O-MLC, with $M_r = 2M_u$, we choose the linear relay mapping such that the relay codewords are $\mathbf{c}_3 = (\mathbf{G}_3\mathbf{z}_3 \bmod \Lambda_{S_3})$ with $\mathbf{z}_3 = [\mathbf{z}_1^T, \mathbf{z}_2^T]^T$. After some manipulations, it can be verified that the decoding equation of (11) is transformed into

$$\hat{\mathbf{z}} = \arg\min_{\mathbf{z}} |\mathbf{F}_{dst}\mathbf{y}_{dst} + (\mathbf{B}_{dst}\mathbf{u} - \mathbf{B}_{dst}\mathbf{G}\mathbf{z})|^2 \quad (23)$$

where $n = 8M_uLT$. Then for the linear one-to-one relay mapper, we have



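$$\mathbf{G} = \mathrm{diag}(\mathbf{G}_1, \mathbf{G}_2, \mathbf{G}_3) \begin{bmatrix} \mathbf{I}_{4M_uLT} \\ \mathbf{I}_{4M_uLT} \end{bmatrix}. \quad (24)$$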

For the linear modulo-sum relay mapper, with $M_u = M_r$, we choose the linear relay mapping such that $\mathbf{z}_3 = \mathbf{z}_1 + \mathbf{z}_2$, and the corresponding $\mathbf{G}$ can be similarly derived. Note that the decoder now searches over all integer vectors $\mathbf{z}$ in (23); thus the lattice decoder using the efficient sphere decoding algorithm
[El, [ini can be applied . 
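The two linear relay mappings can be illustrated with small integer "stacking" matrices that take the users' integer vectors $[\mathbf{z}_1; \mathbf{z}_2]$ to $[\mathbf{z}_1; \mathbf{z}_2; \mathbf{z}_3]$. The sketch below is an illustration under stated assumptions: each user vector has a small hypothetical length `m`, the generator matrices $\mathbf{G}_i$ and the modulo reductions are omitted, and the function names are not from the paper.

```python
def identity(m):
    """m x m identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]


def one_to_one_stack(m):
    """Matrix S with S @ [z1; z2] = [z1; z2; z3], where z3 = [z1; z2]."""
    return identity(2 * m) + identity(2 * m)  # list concatenation stacks rows


def modulo_sum_stack(m):
    """Matrix S with S @ [z1; z2] = [z1; z2; z3], where z3 = z1 + z2."""
    sum_rows = [[1 if j in (i, i + m) else 0 for j in range(2 * m)]
                for i in range(m)]
    return identity(2 * m) + sum_rows


def matvec(matrix, vector):
    """Plain integer matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


z = [1, 2, 3, 4]  # z1 = [1, 2], z2 = [3, 4], so m = 2
stacked_o = matvec(one_to_one_stack(2), z)
stacked_ms = matvec(modulo_sum_stack(2), z)
```

In both cases the decoder only has to search over the free integer vector $[\mathbf{z}_1; \mathbf{z}_2]$; the relay part is determined linearly, which is what allows a standard lattice (sphere) decoder to be used. The one-to-one mapping doubles the relay dimension, while the modulo-sum mapping keeps it equal to a single user's dimension.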

In the following simulation results, the number of slots is selected as $L = 2$, and the sum rate $(R_1 + R_2)$ is 4 BPCU. The relay forwards the message only when the users' messages are correctly decoded. All the channel links are Rayleigh faded and, unless otherwise specified, the sources-to-relay (S-R) channel link is 10 dB better than the other channel links. In Fig. 3, for single-antenna nodes, we show that O-MLC has better error performance than MS-MLC, and both outperform the protocols of [5], [9], [13] and [14] in terms of outage probability and achieve the diversity $\min\{M_u(M_r + N), (M_u + M_r)N\}$ as expected. In Fig. 4, for the cases $M_u = N = 1, M_r = 2$ and $M_u = M_r = 1, N = 2$ (where the S-R link is 15 dB better than the other channel links), respectively, we show that our proposed coding schemes outperform the MAF. For the former case, the MAF achieves a diversity of only 2 instead of 3. Note that the methods in [13] and [14] cannot be straightforwardly extended to the case of multiple-antenna nodes.
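As a quick check of the diversity expression quoted above, the following sketch evaluates $\min\{M_u(M_r + N), (M_u + M_r)N\}$ for the antenna configurations mentioned in this section. It assumes that $M_u$, $M_r$ and $N$ denote the numbers of antennas at each user, the relay and the destination, respectively; the function name is hypothetical.

```python
def ddf_marc_diversity(m_u, m_r, n_dst):
    """Evaluate min{M_u (M_r + N), (M_u + M_r) N} for given antenna counts."""
    if min(m_u, m_r, n_dst) < 1:
        raise ValueError("antenna counts must be positive integers")
    return min(m_u * (m_r + n_dst), (m_u + m_r) * n_dst)


single_antenna = ddf_marc_diversity(1, 1, 1)
two_relay_antennas = ddf_marc_diversity(1, 2, 1)
two_destination_antennas = ddf_marc_diversity(1, 1, 2)
```

For single-antenna nodes the expression evaluates to 2, and for either multiple-antenna configuration with three antennas in total at the relay and destination it evaluates to 3, in line with the diversity of 3 reported for those cases.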

For the simulation of practical lattice codes based on the one-stage practical decoder and the linear relay mapper, with slot length $T = 2$, we use pairs of self-similar, randomly generated nested lattices drawn from the lattice ensemble defined in Definition 5. For the same settings as above, the block error rates for O-MLC and MS-MLC are presented in Fig. 5. The parameters of the linear codes in the lattice ensemble for O-MLC and MS-MLC are $(p_i, k_i) = (97, 3)$ and $(47, 3)$, $\forall i$ (cf. Definition 5), respectively. The diversity of 3 for each user is achieved as expected using our finite-$T$ code construction.

Fig. 3. The outage probability for O-MLC ((14), (15)) and MS-MLC ((14), (15), (20)) vs. the protocols in [5], [9], [13] and [14].

VI. Conclusion

In this work, we have proposed O-MLC and MS-MLC for structured MARC coding. The former enjoys better error performance, while the latter provides more flexibility to trade off complexity against error performance. The error performance of MS-MLC can approach that of O-MLC by increasing the complexity. We have shown that with the new K-stage decoding, instead of the one-stage decoding considered in previous works, the structured O-MLC can approach the rate performance of an unstructured codebook with ML decoding. When only the one-stage decoder is used, O-MLC can still achieve the optimal DMT of DDF. Besides the theoretical results, we have also considered the design of practical short-length lattice codes with linear mapping, which facilitates efficient lattice decoding. Simulation results have shown that our proposed coding schemes outperform existing schemes in terms of outage probabilities.

Fig. 4. The outage probability for O-MLC ((14), (15)) and MS-MLC ((14), (15), (20)) vs. MAF [5].

Appendix 

A. Proof of Lemma 1

(I) Some useful definitions: Here we introduce some notation for simplification. We denote the nesting ratio in Definition 1 as $\chi_i = 2^{R_i/(2M_u)}$, while the dimensions of the lattice code are $n_i = 2M_uLT$ $(1 \le i \le K)$. The corresponding parameters for the relay are $\chi_{K+1}$ and $n_{K+1}$, respectively. We also have the following definitions.
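The relation between the nesting ratio, the lattice dimension and the codebook size can be checked numerically. The sketch below assumes the nesting ratio $\chi_i = 2^{R_i/(2M)}$ and the real dimension $n_i = 2MLT$ (with $M$ the number of transmit antennas of the node), so that the codebook size is $\chi_i^{n_i} = 2^{R_iLT}$; the function names and the numerical values are illustrative only.

```python
import math


def lattice_dimension(m_antennas, n_slots, slot_length):
    """Real dimension n = 2 * M * L * T of the lattice code."""
    return 2 * m_antennas * n_slots * slot_length


def nesting_ratio(rate, m_antennas):
    """Nesting ratio chi = 2^(R / (2 M)) (assumed form, see lead-in)."""
    return 2.0 ** (rate / (2 * m_antennas))


def codebook_size(rate, m_antennas, n_slots, slot_length):
    """Number of codewords chi^n, which should equal 2^(R * L * T)."""
    n = lattice_dimension(m_antennas, n_slots, slot_length)
    return nesting_ratio(rate, m_antennas) ** n


# Illustrative parameters: R = 2 BPCU, one antenna, L = 2 slots, T = 2.
size = codebook_size(rate=2, m_antennas=1, n_slots=2, slot_length=2)
```

With these values the dimension is 8, the nesting ratio is 2, and the codebook contains $2^8 = 256$ codewords, which matches $2^{RLT}$.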

Definition 5 (Loeliger lattices ensemble [11]): Let $\Lambda_{C_i}$ be a lattice generated by a linear code $C_i$ as $\Lambda_{C_i} = \{\mathbf{z} \in \mathbb{Z}^{n_i} : \mathbf{z}_{p_i} \in C_i\}$, where $\mathbf{z}_{p_i}$ is obtained by applying the componentwise reduction modulo $p_i$ operation on $\mathbf{z}$ [11], and the $(n_i, k_i)$ linear code $C_i$ is defined over the finite field $\mathbb{F}_{p_i}$ ((T1.1) in Table I). The Loeliger lattices ensemble is the lattices ensemble $\{\Lambda_{c_i} = \gamma_i\Lambda_{C_i} : C_i \in \mathcal{C}_{i,Loe}, \gamma_i \in \mathbb{R}\}$, where $\mathcal{C}_{i,Loe}$ is a balanced set of linear codes [17]. In our analysis, we let $p_i \to \infty$ and $\gamma_i \to 0$ such that the fundamental volume of $\Lambda_{c_i}$ ((T1.3) in Table I), $V_f(\Lambda_{c_i}) = \gamma_i^{n_i} p_i^{n_i - k_i}$, is fixed.
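The lattice construction in Definition 5 (lift a linear code over a prime field to the integers by reduction modulo $p$) can be illustrated on a tiny example. The sketch below is a toy instance with hypothetical parameters ($p = 5$, $n = 3$, $k = 1$) and unit scaling $\gamma = 1$; the helper names are not from the paper.

```python
from itertools import product


def span_code(generator_rows, p):
    """All codewords of the linear code over F_p spanned by generator_rows."""
    n = len(generator_rows[0])
    code = set()
    for coeffs in product(range(p), repeat=len(generator_rows)):
        word = tuple(sum(a * g[j] for a, g in zip(coeffs, generator_rows)) % p
                     for j in range(n))
        code.add(word)
    return code


def in_lattice(z, code, p):
    """True iff the integer vector z reduces modulo p to a codeword."""
    return tuple(x % p for x in z) in code


def fundamental_volume(code, p, n):
    """Volume of the cube [0, p)^n divided by the number of lattice points in it."""
    points = sum(1 for z in product(range(p), repeat=n) if in_lattice(z, code, p))
    return p ** n // points


P, N = 5, 3
CODE = span_code([(1, 2, 3)], P)  # a (3, 1) code over F_5
```

For a full-rank $(n, k)$ code the cube $[0, p)^n$ contains $p^k$ lattice points, so the unscaled fundamental volume is $p^{n-k}$; scaling by $\gamma$ multiplies it by $\gamma^n$, which is why $p \to \infty$ must be balanced by $\gamma \to 0$ to keep the volume fixed.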

The following balanced set definition generalizes the balanced set defined in [11].

Definition 6 (Balanced set for the K-user MARC): Let $C$ be a set of vectors $\mathbf{c}$, where $\mathbf{c} = \mathbf{c}_1 \times \cdots \times \mathbf{c}_{K+1} \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_{K+1}}$, and let $\mathcal{C}_E$ be a finite set of such $C$ (e.g., $\mathbf{c}$ is a codeword of a codebook $C$, and $\mathcal{C}_E$ is a codebook ensemble). We collect all non-zero $\mathbf{c}$ in the $C$ of $\mathcal{C}_E$ as $(\mathcal{C}_{\mathcal{C}_E})^* = \{\mathbf{c} \in \mathbb{R}^n : \mathbf{c} \in C^*, C \in \mathcal{C}_E\}$, where $n = \sum_{i=1}^{K+1} n_i$ and $C^* = C \setminus \{\mathbf{0}\}$. The set $\mathcal{C}_E$ is called balanced if every nonzero element $\mathbf{c}$ in $(\mathcal{C}_{\mathcal{C}_E})^*$ is contained in the same number, denoted by $N_b$, of $C$ from $\mathcal{C}_E$. We refer to $N_b$ as the balanced number.

Fig. 5. Comparison of theoretical outage probabilities and the block error probabilities using practical linear relay mapping, (18), (19) for O-MLC and (18), (19), (21) for MS-MLC.
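The balanced-set property can be verified by brute force on a small ensemble. The sketch below uses a single-user toy case, not the $(K+1)$-fold product of the definition: the ensemble of all one-dimensional linear codes over $\mathbb{F}_p^n$ with $p$ prime. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def one_dim_codes(n, p):
    """All distinct one-dimensional linear codes (lines through 0) in F_p^n, p prime."""
    codes = set()
    for g in product(range(p), repeat=n):
        if any(g):
            codes.add(frozenset(tuple(a * x % p for x in g) for a in range(p)))
    return codes


def balanced_number(ensemble):
    """Return N_b if every nonzero element lies in the same number of codebooks, else None."""
    counts = Counter(c for code in ensemble for c in code if any(c))
    values = set(counts.values())
    return values.pop() if len(values) == 1 else None


LINES = one_dim_codes(2, 3)

# A deliberately unbalanced ensemble: (1, 0) appears twice, the others once.
UNBALANCED = [frozenset({(0, 0), (1, 0), (2, 0)}),
              frozenset({(0, 0), (1, 0), (0, 1)})]
```

Every nonzero vector of $\mathbb{F}_3^2$ lies on exactly one of the four lines, so this ensemble is balanced with $N_b = 1$. Averaging a sum over codewords across a balanced ensemble therefore reduces to a sum over all nonzero vectors weighted by $N_b$, which is the mechanism the proof relies on.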

(II) Proof: Here we show the proof only for the second inequality of (16), since the proof for the first one is similar to that in [10]. An outline of the proof is provided first to give insight into how to solve the problem that the cosets with set-$S^{(1)}$ errors, $\mathcal{O}_{S^{(1)}} = \{\mathbf{d} \in \mathcal{O}_{\Delta}^{\psi^{o}} : \mathbf{d}_i \neq \mathbf{0}, \forall i \in S^{(1)}\}$ (or even the cosets $\mathcal{O}_{\Delta}^{\psi^{o}}$ in (13)), are not a direct product of $K+1$ lattices, where the differential coset leader $\mathbf{d}_i$ for user $i$ is defined below (13) with (T1.4) and (T1.10) in Table I. First, by averaging over the ensemble of mappers, and judiciously using the balanced set property in Definition 6, we can upper-bound the ensemble-averaged sum over $\mathcal{O}_{S^{(1)}}$ in (16) using the RHS of (27b) below. Note that instead of the summation over the cosets $\mathcal{O}_{S^{(1)}}$ as in the RHS of (26) below, in the RHS of (27b) the summation is over the lattice points of the set $(\Lambda_{c_{dst}})^*$ in (29) below, which makes further upper-bounding possible. By taking the limits, we conclude our proof.
Now we give the details to show the second inequality of (16). First we introduce some useful notation for the upcoming (25a). The differential mapper $\psi_{\Delta}^{o}$, which corresponds to $\psi^{o}$ in Definition 2, is defined by replacing the super-codeword $\mathbf{c} = [(\mathbf{c}_u)^T, (\mathbf{c}_r)^T]^T$ in $\psi^{o}$ with the differential super-codeword $\mathbf{d}(\mathbf{w})$ in (13), as in (T1.14) of Table I. Let $\mathcal{E}_{\psi_{\Delta}^{o},C}$ be the ensemble corresponding to $\mathcal{E}_{\psi^{o},C}$ in (16), but with the one-to-one mappers $\psi^{o}$ replaced by the corresponding differential mappers $\psi_{\Delta}^{o}$. Also let $f(\cdot)$ be the indicator function where $f(\mathbf{d}) = 1$ if $\mathbf{d} \in \mathcal{R}$, and $f(\mathbf{d}) = 0$ otherwise. Clearly the following (25a) is valid for the left-hand side (LHS) of the second inequality of (16), since $|\mathcal{E}_{\psi^{o},C}| = |\mathcal{E}_{\psi_{\Delta}^{o},C}|$,



(J y y /(d) g (T^+l)"^^'n,-,5C)2^'"" teV-/(^)^^ (25) 

As for the above (25b), it can be proved from the RHS of the upcoming (27b) by averaging over $\mathcal{C}_{Loe}$ using techniques similar to those in [10] and [11]. Thus we focus on the proof of (27b) below. As pointed out in the beginning of this appendix, our trick to prove this critical step is replacing the summation over the "non-lattice" cosets $\mathcal{O}_{S^{(1)}}$ in the LHS of (25b) with the set $(\Lambda_{c_{dst}})^*$ in (27b) by showing



1 ^ 



E /w^^ E fiT^ E E /(d) 



(26) 



5(1) ''^^^(l) 



£ £ /(d), (27) 



where the derivation of each step comes as follows: 

For (26), we first define $\mathcal{C}_{\psi_{\Delta},E}$ as the ensemble of all mapped nested-codebooks (differential) $C_{\psi_{\Delta}}$ given a particular super Loeliger linear code $C$ (T1.6), with the codewords of $C_{\psi_{\Delta}}$ satisfying the mapping rules of the corresponding $\psi_{\Delta}^{o}$. Note that all $C_{\psi_{\Delta}} \in \mathcal{C}_{\psi_{\Delta},E}$ are based on the same $C$, but with different mappers. Also let $\mathcal{C}_{Loe} = \mathcal{C}_{1,Loe} \times \cdots \times \mathcal{C}_{K+1,Loe}$ be the ensemble of all possible $C$, with $\mathcal{C}_{i,Loe}$ given in Definition 5, and let $\Psi_{\Delta}^{o}$ be the ensemble of all possible differential mappers. Then (26) is obtained from $|\mathcal{E}_{\psi_{\Delta}^{o},C}| = |\mathcal{C}_{\psi_{\Delta},E}||\mathcal{C}_{Loe}|$ by definition.

For (|T7l a), given mapper \|/^"^ and Loeglier linear code C'^;" (thus mapped-codebook C^^'if), we rewrite 
the set-5(i) error cosets as Ojfi, = {d G Ac,,,- ■ d G C^^.e.di ^ 0, Vz G 5^^^}, where Ac„, is in (T1.9), and 
set-J*-^) errors is defined in Definition 51 Then the term inside the parentheses on the LHS of (l27la) comes 
from

I I m) = -^{N, £ /(d)) (28) 

5(1) 

where we collect all points belonging to cosets O^fl) over all possible mapped codebooks C^ne as 
d e Ac,,, : d e Ojfi, ,C^- e (T^^.e}- For (|28]), it comes from the fact that C^^^e is a balanced 
set as follows, where (Cc^^j,)* is the collection of non-zero codewords in C\^^,e, by setting {Ccj,)* in 
Definition [6] with Ce = G|fA,E ((T1.15) in Table H]). Consider two different vectors c and c' belonging to 
e)* - ^^^h mapped-codebook Cy'^u' e Cx^^^e containing c but not c', with the corresponding mapper 
vi/^'"^, we can easily form another C(\j;^"ey G C\^i^.e containing c' by forming a new one-to-one mapper (Ya"^)' 
from \|/^"^. Therefore, c and c' are symmetric, and thus each vector in (Cc^^^^)* is contained in equal 
number, denoted by A^/„ of Cyme from G|/^,e- Then G\i^.e is a balanced set as in Definition [6l Together 
with the fact that (Cc^^j,)* is the set of coset leaders of (AcJ"", that is, (Cc^^j,)* = {d : d G (Ac„J*}, then 
(|28] ) follows. Finally, with (xa:+i)"^+i being the relay codebook size, since the differential mapper \^'^^ 
is one-to-one, each nonzero user codeword can possibly be mapped to (xa:+i)"^+i — 1 relay codewords. 
Also the mapped nested-codebook ensemble G|/^,e is a balanced set with balanced number A^^, we have 
that |CVA,E|/A^fo = (Xif+i)"^+i - 1. Then we obtain ([27] a) from (|28T). 
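The counting step "ensemble size divided by balanced number equals relay codebook size minus one" can be reproduced on a toy model. The sketch below assumes a hypothetical setting in which the user super-codebook and the relay codebook both have $q$ elements, and each differential mapper is a bijection that sends zero to zero; it then counts in how many mappers a given (user, relay) pair appears.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def one_to_one_mappers(q):
    """All bijections on {1, ..., q-1}, i.e. toy differential mappers fixing 0."""
    nonzero = range(1, q)
    for image in permutations(nonzero):
        yield dict(zip(nonzero, image))


def pair_counts(q):
    """Return (number of mappers, Counter of how often each nonzero pair occurs)."""
    counts, total = Counter(), 0
    for mapper in one_to_one_mappers(q):
        total += 1
        for d_user, d_relay in mapper.items():
            counts[(d_user, d_relay)] += 1
    return total, counts


Q = 5
TOTAL, COUNTS = pair_counts(Q)
```

Every nonzero pair occurs in the same number $(q-2)!$ of the $(q-1)!$ mappers, so the toy ensemble is balanced and the ratio of ensemble size to balanced number is $q - 1$, the analogue of $(\chi_{K+1})^{n_{K+1}} - 1$ in the text.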

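For (27b), we define $(\Lambda_{c_{dst}})^*$, formed from the super coding-lattice $\Lambda_{c_{dst}}$ ((T1.9) in Table I), as

$$(\Lambda_{c_{dst}})^* = \{\mathbf{d} \in \Lambda_{c_{dst}} : \mathbf{d}_i \neq \mathbf{0}, \forall i \in S^{(1)}\}. \quad (29)$$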

From the definition of (Ac,,,.)* right after (|28T ), we have (Ac,,,.)* C (Ac,,,)*. Together with the fact that the 
indicator function /(■), defined right before (|25T ), is a nonnegative function, (|27lb) is obtained. 

Finally, the second inequality of (16) can be obtained from (25b) by following steps similar to those in [10] and [16]. The key observation is that as $T \to \infty$, the shaping lattices $\Lambda_{S_i}$ from Definitions 1 and 5 will be good for minimum square error quantization [22], so that their Voronoi regions $\mathcal{V}(\Lambda_{S_i})$ will make the
signal behave like an optimal Gaussian signal. Thus the term jjlog/ yK+\ f{d)dd/ Ylj^j^w K+n^fi-^Si) 
in (|25]b) will approach -7^ ^f^/clH^f'^ ''^'^) in (HH). With (xj^+i)"^+i = 2^k+xLT defined in Appendix 
A-(I), we then have the second inequality of (16). The details are given in Appendix C.
B. Proof of the rate region of the K-stage MS-MLC in Theorem 2

The proof for the rate region of MS-MLC is similar to the proof of Theorem 1. Here we show only the principal difference, which results from the fact that the balanced set structure exploited in Appendix A (to obtain (28)) is no longer valid for MS-MLC. We solve this problem by introducing a new 2-partition balanced set, defined in Definition 7 below. Specifically, we will show a counterpart of (25) for the first stage as follows: For MS-MLC, with the relay-mapper and linear-code ensemble $\mathcal{E}_{\psi^{mod},C}$ of $\{\psi^{mod}, C_i\}$ and $S^{(1)} = \{1, \ldots, K\}$, we have


•...mod 



(30) 



which, compared with (|25l) . has an additional term (second term) in the RHS, (l30l) where we let d^(i) = 

[d;^,...,df ^ ]^, z'l < ••■ < /j^(i)|,Vzy G 5^^^, and the indicator function /'^*''(d^(i)) = 1 if d^(i) G 'K^^^K 

with <K^^'^ = |v^(i) gM^I^*''!^"^^ : vG %,v, = 0,V/G {{1, 1} and the decision region % 

given in Lemma \T\ This additional term results in the additional rate constraint (|20] ) compared with 

Theorem [U Similar to the derivations of (|25la) and (|26l l. the LHS of (|30l ) equals 

/ \ 



1 



1 



la, 



1 



v 



I I /(d) 



...mod 



5(1) 



(31) 



Compared with (26), only the (differential) one-to-one mapper $\psi_{\Delta}^{o}$ is replaced by $\psi_{\Delta}^{mod}$ in (31). However, unlike O-MLC in Appendix A, now $\mathcal{C}_{\psi_{\Delta},E}$ is not a balanced set, which makes simplifying (31) more difficult compared with (27a). To solve this problem, we need to extend Definition 6 as follows.

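Definition 7 (2-partition balanced set): Following the notation in Definition 6, we say that the set $\mathcal{C}_E$ is 2-partition balanced if the non-zero vector set $(\mathcal{C}_{\mathcal{C}_E})^*$ can be partitioned as $(\mathcal{C}_{\mathcal{C}_E})^* = \{\mathcal{C}^*_{\mathcal{C}_E,1}, \mathcal{C}^*_{\mathcal{C}_E,2}\}$, where every element in $\mathcal{C}^*_{\mathcal{C}_E,1}$ is contained in the same number, denoted by $N_{b,1}$, of $C$ from $\mathcal{C}_E$, while every element in $\mathcal{C}^*_{\mathcal{C}_E,2}$ is also contained in the same number, denoted by $N_{b,2}$, of $C$ from $\mathcal{C}_E$.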
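The 2-partition balanced property can be observed on a toy modulo-sum ensemble. The sketch below is an illustration under stated assumptions, not the lattice construction itself: two users with codewords in $\mathbb{Z}_q$, each user mapper a bijection fixing zero, and the relay differential codeword given by the sum of the mapped user codewords modulo $q$. It counts how many mapper pairs contain each differential triple and splits the counts according to whether the relay component is zero.

```python
from collections import Counter
from itertools import permutations, product


def bijections_fixing_zero(q):
    """All bijections on {1, ..., q-1} (toy per-user mappers that send 0 to 0)."""
    nonzero = list(range(1, q))
    return [dict(zip(nonzero, image)) for image in permutations(nonzero)]


def modulo_sum_counts(q):
    """Count how many mapper pairs contain each triple (d1, d2, d_r), d1, d2 nonzero."""
    counts = Counter()
    mappers = bijections_fixing_zero(q)
    for psi1, psi2 in product(mappers, repeat=2):
        for d1, d2 in product(range(1, q), repeat=2):
            d_r = (psi1[d1] + psi2[d2]) % q
            counts[(d1, d2, d_r)] += 1
    return counts


def partition_balanced_numbers(counts):
    """Distinct occurrence counts for the partitions d_r != 0 and d_r == 0."""
    nonzero_relay = {n for (_, _, d_r), n in counts.items() if d_r != 0}
    zero_relay = {n for (_, _, d_r), n in counts.items() if d_r == 0}
    return nonzero_relay, zero_relay


NONZERO_RELAY, ZERO_RELAY = partition_balanced_numbers(modulo_sum_counts(4))
```

For $q = 4$ every triple with a nonzero relay component occurs 8 times and every triple with a zero relay component occurs 12 times: the ensemble is not balanced as a whole, but each partition is, which is exactly the structure Definition 7 captures.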

To simplify the RHS of (31), we now explore the properties of the ensemble $\mathcal{C}_{\psi_{\Delta},E}$ using the 2-partition balanced set in Definition 7. Recall that $\mathcal{C}_{\psi_{\Delta},E}$ is the ensemble of all mapped-codebooks (differential) $C_{\psi_{\Delta}^{mod}}$ with mapper $\psi_{\Delta}^{mod} \in \Psi_{\Delta}^{mod}$, where the differential super-codewords in $C_{\psi_{\Delta}^{mod}}$ satisfy the mapping rules of $\psi_{\Delta}^{mod}$. For any user set $S \subseteq \{1, \ldots, K\}$, let $\mathcal{C}^{S}_{\psi_{\Delta},E}$ be the set of mapped-codebooks formed by collecting every codebook belonging to $\mathcal{C}_{\psi_{\Delta},E}$, but excluding codewords $\mathbf{d} \notin \mathcal{O}_S$, where $\mathcal{O}_S = \{\mathbf{d} : \mathbf{d}_i \neq \mathbf{0}, \forall i \in S\}$. The fact that, for every user set $S$, $\mathcal{C}^{S}_{\psi_{\Delta},E}$ is a 2-partition balanced set in the sense of Definition 7 follows from the following observations. According to whether the differential codeword of the relay satisfies $\mathbf{d}_r = \mathbf{0}$ or not, we can categorize the differential codewords into two partitions. The differential codewords in each partition are symmetric according to the proof in Appendix A. Note that $\mathbf{d}_r = \mathbf{0}$ occurs only in MS-MLC, due to the modulo-sum operation in Definition 3. In O-MLC, the one-to-one mapper guarantees $\mathbf{d}_r \neq \mathbf{0}$, and results in the simpler (25) compared with our target (30). Now for the first stage, we set $S = S^{(1)}$ and the two partitions of $\mathcal{C}^{S^{(1)}}_{\psi_{\Delta},E}$ can
be formed as follows. Let C* and C* ^^i) be the codeword partitions corresponding to Q-^ j and C^^ 2 
in Definition E] with Ce = qj^'g respectively, where r^,,) ={de C* : d, ^ 0,3 G ^(i),^^"' G ^7^}, 

'^Va-E'I _ ^ 

where the codewords of the relay are distinguishable since d^ 7^ 0; C* (d is defined similarly but with 
d,- 7^ replaced by d^ = 0. Also let the corresponding balanced numbers of Q-^ j and C^^ 2 ^^,1 
Nb,2 respectively. Now we can simplify the RHS of (|3TI) using the aforementioned 2-partition balanced 
set property and following the proof of the 0-MLC counterpart (l28l) . while in (l28l) C\^i^,e is a balanced 
set. Corresponding to (|28T ). the term inside the parentheses on the RHS of ( f3T1) now equals 

E E E E fw 02) 

where (A^:-^^^..!)* and (Ac„^ 2)* are the lattice codeword sets for the 2-partitions corresponding to (A^^^J*" 



in (1281) . respectively. 

Unfortunately, the balanced numbers in d32] ) cannot be easily computed as in the proof of (IT71 a) and 
vary with g for different sets S. Thus we alternatively show two upper-bounds as 

— 7 ^ 7' ^'^d TTT^ — 1 — 7 \ 7' (33) 



where $(\chi_{K+1})^{n_{K+1}}$ is the relay codebook size from Definition 1. Then, following arguments similar to those used in proving (27a), (27b) and (25b) (the steps after (28)), we can prove (30) from (32) and (33), with the details omitted. For proving (33), we start with the single-user case where $|S^{(1)}| = 1$ ($S^{(1)} = \{1, \ldots, K\} = \{1\}$ when the number of users is $K = 1$), and then extend to the case $|S^{(1)}| = 2$ as in the upcoming (34) and (35). By repeating this procedure recursively, we obtain the formulation of the balanced numbers in (33) as (36) in the next paragraph and then derive the upper bound. When $|S^{(1)}| = 1$, $\mathcal{C}_{\psi_{\Delta},E}$ is a balanced set, and thus the (normalized) balanced numbers are given by $(N_{b,1}/|\mathcal{C}_{\psi_{\Delta},E}|)_{|S^{(1)}|=1} = ((\chi_{K+1})^{n_{K+1}} - 1)^{-1}$ from the proof of (27a), and $(N_{b,2}/|\mathcal{C}_{\psi_{\Delta},E}|)_{|S^{(1)}|=1} = 0$ by definition. Here the subscript $|S^{(1)}| = 1$ is added to the notation of the normalized balanced numbers for use in the upcoming (34) and (35). For $|S^{(1)}| = 2$, the corresponding balanced number for the partition with $\mathbf{d}_r = \mathbf{0}$ is

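$$\left(\frac{N_{b,2}}{|C_{\psi_\Lambda,E}|}\right)_{|S^{(1)}|=2} = \frac{1}{(\chi_3)^{n_3}-1}\left(\left((\chi_3)^{n_3}-1\right)\left(\frac{N_{b,1}}{|C_{\psi_\Lambda,E}|}\right)_{|S^{(1)}|=1}\right). \qquad (34)$$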

To show (34), we count the occurrences of a particular (differential) super-codeword $d_1 \times d_2 \times (d_r = 0)$ in the overall two-user mapped-codebook ensemble $C_{\psi_\Lambda,E}$, where user $i$'s codeword (coset leader) is $d_i$, $i = 1,2$. Let $\psi_{\Lambda,i}(d_i)$ be the (differential) mapper corresponding to user $i$ as in Definition 3. First, we compute $(N_{b,2})_{|S^{(1)}|=2}/(N_{b,1})_{|S^{(1)}|=1}$. From Definition 6, given a particular $d_1 \times d_{r,1}$ with $d_1 \neq 0$ such that $d_{r,1} = \psi_{\Lambda,1}(d_1)$, there will be $(N_{b,1})_{|S^{(1)}|=1}$ possible mappers $\psi_{\Lambda,1}$. Now from Definition 3, for this partition to have $d_r = \sum_{i=1}^{2}\psi_{\Lambda,i}(d_i) = 0$, the mappers corresponding to user 2 must satisfy $(d_{r,1} + \psi_{\Lambda,2}(d_2)) \bmod \Lambda_{s_r} = 0$ since $d_{r,1} = \psi_{\Lambda,1}(d_1)$. Thus for a fixed $d_{r,1}$, the vector $\psi_{\Lambda,2}(d_2)$ for the given $d_2$ is also fixed from the definition of the codomain of $\psi_{\Lambda,2}(\cdot)$ given in Definition 2. Also, by definition, $d_2 \neq 0$ since the user messages (encoded in cosets) are with set-$S^{(1)}$ errors, where $S^{(1)} = \{1,2\}$. Then for a fixed $d_{r,1}$, excluding the given $d_2$ and the zero vector, by assigning the mapping rules for the remaining $(\chi_2)^{n_2} - 2$ points in the domain of $\psi_{\Lambda,2}(\cdot)$, there are $\prod_{l=2}^{(\chi_2)^{n_2}-1}((\chi_3)^{n_3} - l)$ possible injective mappers $\psi_{\Lambda,2}$, where $(\chi_i)^{n_i}$ is transmitter $i$'s (users and relay) codebook size. Note that to make $(\psi_{\Lambda,2}(d_2) + d_{r,1}) \bmod \Lambda_{s_r} = 0$, it is required that $d_{r,1} \neq 0$ since $d_2 \neq 0$. As there are a total of $((\chi_3)^{n_3} - 1)$ possible $d_{r,1} \neq 0$ in the relay's (differential) codebook, we have

$$\frac{(N_{b,2})_{|S^{(1)}|=2}}{(N_{b,1})_{|S^{(1)}|=1}} = \left((\chi_3)^{n_3} - 1\right)\prod_{l=2}^{(\chi_2)^{n_2}-1}\left((\chi_3)^{n_3} - l\right).$$

Also

$$\frac{|C_{\psi_\Lambda,E}|_{|S^{(1)}|=2}}{|C_{\psi_\Lambda,E}|_{|S^{(1)}|=1}} = \left((\chi_3)^{n_3} - 1\right)\prod_{l=2}^{(\chi_2)^{n_2}-1}\left((\chi_3)^{n_3} - l\right)$$

by counting all possible injective mappers $\psi_{\Lambda,2}$ of user 2. Thus (34) is valid. Similar to (34), for $|S^{(1)}| = 2$, the corresponding balanced number for the partition with $d_r \neq 0$ is

$$\left(\frac{N_{b,1}}{|C_{\psi_\Lambda,E}|}\right)_{|S^{(1)}|=2} = \frac{1}{(\chi_3)^{n_3}-1}\left(\left((\chi_3)^{n_3}-2\right)\left(\frac{N_{b,1}}{|C_{\psi_\Lambda,E}|}\right)_{|S^{(1)}|=1} + \left(\frac{N_{b,2}}{|C_{\psi_\Lambda,E}|}\right)_{|S^{(1)}|=1}\right). \qquad (35)$$

The proof of (35) is similar to that for (34), but now $(\psi_{\Lambda,2}(d_2) + d_{r,1}) \bmod \Lambda_{s_r} \neq 0$. The first term in the parentheses on the RHS of (35) corresponds to the case $d_{r,1} \neq 0$, while the second term corresponds to the case $d_{r,1} = 0$. The details are omitted.
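The mapper-counting step used for (34) can be checked by brute force on a toy instance. The sketch below is illustrative only: it models the user-2 and relay differential codebooks as the integer sets {0, ..., m2-1} and {0, ..., M-1} (a hypothetical stand-in for the lattice cosets), assumes every injective mapper sends the zero vector to zero, and uses the made-up names `count_mappers` and `product_formula`.

```python
from itertools import permutations
from math import prod

def count_mappers(m2, M, fixed_image=None):
    """Count injective maps {0..m2-1} -> {0..M-1} with 0 -> 0 (assumed).

    If fixed_image is given, only maps sending the point d2 = 1 to
    fixed_image are counted (the image of the given d2 is pinned).
    """
    count = 0
    # images[j] is the image of the nonzero domain point j + 1
    for images in permutations(range(1, M), m2 - 1):
        if fixed_image is None or images[0] == fixed_image:
            count += 1
    return count

def product_formula(m2, M):
    """Product over l = 2 .. m2 - 1 of (M - l), as in the counting argument."""
    return prod(M - l for l in range(2, m2))

if __name__ == "__main__":
    m2, M = 3, 5  # toy codebook sizes, chosen only for illustration
    print(count_mappers(m2, M, fixed_image=2), product_formula(m2, M))
    print(count_mappers(m2, M), (M - 1) * product_formula(m2, M))
```

Under these assumptions, pinning the image of the given $d_2$ leaves the product over the remaining domain points, and releasing it multiplies the count by the number of nonzero relay points, which is the ratio structure used in the argument above.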
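Finally, by repeating the arguments in the previous paragraph we can find the balanced numbers for $|S^{(1)}| = 3$ with (34) and (35), and so on. Then for the balanced numbers with a general $|S^{(1)}|$, we have

$$\frac{N_{b,1}}{|C_{\psi_\Lambda,E}|} = \frac{1}{(\chi_{K+1})^{n_{K+1}}\left((\chi_{K+1})^{n_{K+1}}-1\right)^{|S^{(1)}|}}\left(\left((\chi_{K+1})^{n_{K+1}}-1\right)^{|S^{(1)}|} + (-1)^{|S^{(1)}|+1}\right),$$

$$\frac{N_{b,2}}{|C_{\psi_\Lambda,E}|} = \frac{1}{(\chi_{K+1})^{n_{K+1}}\left((\chi_{K+1})^{n_{K+1}}-1\right)^{|S^{(1)}|}}\left(\left((\chi_{K+1})^{n_{K+1}}-1\right)^{|S^{(1)}|} + \left((\chi_{K+1})^{n_{K+1}}-1\right)(-1)^{|S^{(1)}|}\right), \qquad (36)$$

where $(\chi_{K+1})^{n_{K+1}}$ is the relay codebook size. On noting that $1/\left((\chi_{K+1})^{n_{K+1}}-1\right)^{|S^{(1)}|} \leq 1/\left((\chi_{K+1})^{n_{K+1}}-1\right) \leq 1$, together with (36), one can show that (33) is valid. Then our proof for (30) is complete.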
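The recursion formed by (34) and (35) and the resulting closed form (36) can be cross-checked numerically. The following sketch is a sanity check under stated assumptions, not the authors' procedure: the relay's differential codebook is modeled as the cyclic group of integers modulo `M` (a hypothetical stand-in; the counting only uses the group size), each user's mapped differential codeword is taken as uniform over the nonzero elements, and `recursion`, `closed_form` and `brute_force` are illustrative names.

```python
from fractions import Fraction
from itertools import product

def recursion(M, k):
    """Normalized balanced numbers (b1, b2) for k users via the two-term recursion."""
    b1, b2 = Fraction(1, M - 1), Fraction(0)  # single-user case
    for _ in range(k - 1):
        new_b2 = Fraction(1, M - 1) * ((M - 1) * b1)       # relay sum equals zero
        new_b1 = Fraction(1, M - 1) * ((M - 2) * b1 + b2)  # relay sum is nonzero
        b1, b2 = new_b1, new_b2
    return b1, b2

def closed_form(M, k):
    """Closed-form candidates for the same two normalized balanced numbers."""
    den = M * (M - 1) ** k
    b1 = Fraction((M - 1) ** k + (-1) ** (k + 1), den)
    b2 = Fraction((M - 1) ** k + (M - 1) * (-1) ** k, den)
    return b1, b2

def brute_force(M, k):
    """Fraction of k-tuples of nonzero residues mod M summing to 1 (b1) or 0 (b2)."""
    tuples = list(product(range(1, M), repeat=k))
    hit1 = sum(1 for t in tuples if sum(t) % M == 1)
    hit0 = sum(1 for t in tuples if sum(t) % M == 0)
    return Fraction(hit1, len(tuples)), Fraction(hit0, len(tuples))

if __name__ == "__main__":
    for M in (3, 4, 5):
        for k in (1, 2, 3, 4):
            assert recursion(M, k) == closed_form(M, k) == brute_force(M, k)
    print(closed_form(5, 3))
```

In this toy model both normalized balanced numbers stay at or below 1/(M - 1) for every number of users in error, which is the kind of uniform bound the proof needs from (33).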



C. Proof of (25b) and (16)

We can rewrite (27) as

1 



I /(7z) 



ze(Z")*:Zp=a 



(37) 



(38) 



Ci-°ea<,.ae(C;^,")* 

where $\gamma z$ is defined in (T1.23) in Table I. Here, $(\mathbb{Z}^n)^* = \{z \in \mathbb{Z}^n : z_i \neq 0, \forall i \in \{1,\ldots,K+1\}\}$ and $z_p$ is formed by applying the modulo-$p_i$ operation on the elements of $z$ ((T1.22) in Table I). Now for
summation in the second term of (|38T ). we separate the summation over a G {C^,")* by the cases {a„ 7^ 
0,a^ = 0}, {a,- ^ 0,a,, = 0} and {a^ ^ 0,a„ 7^ 0}. By averaging over Ctoe for these three cases, we have 
(l39l) . (I40l) and (gl]), respectively, 
1 



\Cu> 



ze(Z")*:zp=a 



5c{i,...,^:},57^(^ 



U,es{P?-i) 



+ 



(pr'-i) 



ze(Z")*:(zp)/7^0,/e5,(zp),.,=0,;'e{5'',^:+l} 



/(7z) 



5c{i,...,a:},S7^(|) 



ze(Z«)*:(zp)^+i7^0,(z^),.,=0,;-'e5(i) 
lLe{5,/f+l}lP,- ze(Z")*:(zp)/7^0,(e{5,/f+l},(zp),.,=0,/'e5^ 



/(d)^id 



((x/f+i)«^+>-i)n,,{5('),if+i}^KAc,o 



(39) 

(40) 

(41) 
(42) 
as $p_i \to \infty$, $\gamma_i \to 0$ (Definition 5). Since $f$ has a bounded support ($f$ vanishes at infinity), with the definition of $(\mathbb{Z}^n)^*$, the first term in (38) vanishes for sufficiently large $\gamma_i p_i \to \infty$ as shown in [10], [11]. The terms in (39) also vanish by noting that at least one of the elements of $z_{K+1}$ is equal to a multiple of $p_{K+1}$, which results in $f(\gamma z) \to 0$ in (39). The term in (40) follows similarly. Finally, the term in (41) approaches
((42)) for 5 = 5*^^^, and vanish otherwise in a way similar to dll]), (@0]), as y,- with p'l'^^'i}' = V/(Ac,) 
fixed as in those [lOJ. From Definition [H Vf{Asi)/Vf{AQ) = l'^'^'^ = (x/)"', then ([25] b) can be obtained 
from (42). Finally, (16) can be obtained from (25b) by following the footsteps in [16].
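The vanishing of the terms whose integer vectors contain a nonzero multiple of $p$ can be illustrated with a small numerical experiment. The sketch below makes several assumptions that are not taken from the paper: $f$ is the indicator of a Euclidean ball of radius 3, the dimension is 2, the scaling is $\gamma = p^{-1/2}$ (so that $\gamma \to 0$ while $\gamma p \to \infty$, rather than the fixed-volume scaling used in the proof), and `scaled_sum` is a hypothetical helper name.

```python
import math
from itertools import product

def f(x, radius=3.0):
    """Indicator of a ball: a bounded-support test function (assumed choice)."""
    return 1.0 if math.hypot(*x) <= radius else 0.0

def scaled_sum(p, n=2, reach=2, radius=3.0):
    """Sum f(gamma * z) over integer z whose first coordinate is a nonzero multiple of p."""
    gamma = p ** -0.5  # assumed scaling: gamma -> 0 while gamma * p -> infinity
    first = [m * p for m in range(-reach, reach + 1) if m != 0]
    others = range(-reach * p, reach * p + 1)
    total = 0.0
    for z in product(first, *([others] * (n - 1))):
        total += f([gamma * c for c in z], radius)
    return total

if __name__ == "__main__":
    for p in (2, 7, 11, 13):
        print(p, scaled_sum(p))
```

Once $\gamma p$ exceeds the support radius, every scaled point with a nonzero multiple of $p$ in some coordinate falls outside the support, so the sum is exactly zero; this mirrors the bounded-support argument above in a simplified setting.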

D. Proof of ^ 

Proof: For the K users, we use the self-similar nested lattice (Definition [T]) where A5^. = x,Ac;, 
X/ = [pJ^^J in order to satisfy the transmission rate constraint Ri{Pd) = rdogpd- The ensemble "E^^c^o 
defined in the proof of Theorem \T\ with kj = 1 (Definition [S]) is considered, on which the corresponding 
lattices ensemble is then expurgated in a way similar to that in the proof of Theorem 6 in [[T6l . We denote 
the expurgated ensemble of codebooks, Ccode (i-C-j Cy^^e given the corresponding lattices in the expurgated 
lattices ensemble), as (^^de- Then the average error probability, Pe{pd) in (T1.20) of TableU can be upper 
bounded by 

PeiPd) = E^xp [Pr{Er\Ccode, ^dst)] < Pr{ O) + E^^p [Pr{Er, O^lQ^^,, H^,,)] (43) 

where $\Pr\{Er \mid C_{code}, H_{dst}\}$ is the probability of the event that, given a $\{C_{code}, H_{dst}\}$, not all users are correctly decoded at the destination, and $O$ denotes the outage event set of $H_{dst}$ ($H_{dst}$ does not satisfy (15)). For the second term on the RHS of the inequality in (43), by averaging the term $\Pr\{Er, O^c \mid C_{code}, H_{dst}\}$ over the expurgated codebook ensemble and then over $H_{dst} \in O^c$, we will show

$$E_{H_{dst}}\left[\Pr\{Er, O^c \mid H_{dst}\}\right] \doteq \Pr\{O\}. \qquad (44)$$

Following steps similar to those in [10] and [16], and considering a tuple of multiplexing gains $r_i$ to meet a diversity requirement $d$ for each user as in [20], given an $H_{dst}$, we have

^K+\-^ V5C{l,...,/f},57^<]) \\:i\Mu^!yirJ 

■ aet I l2(|s|M„+M,)Lr + j I 



(45) 
where $p_i \to \infty$, $\forall i$, and $\delta > 0$.

Let $\Pr\{O\} \doteq \rho_d^{-d(r)}$. Although $\Pr\{O\}$ does not necessarily guarantee the minimum outage probability, it suffices for the DMT analysis as indicated in [23]. The explicit formulation of $d(r)$ is generally difficult to
obtain since the joint probability density function (pdf) of eigenvalues of (H^^^^*'^^^)^H^'^^'^^*^ is generally 
not easy to evaluate. However, from Theorem 3.2.17 of [24], it can be seen that the joint pdf of these eigenvalues is a continuous function. Therefore, by choosing a sufficiently large, but finite $T$, such that the term on the RHS in (45) decays fast enough, we can prove that $E_{H_{dst}}[\Pr\{Er, O^c \mid H_{dst}\}]$ is exponentially equal to $\Pr\{O\}$ using techniques similar to those in [10], [16], [20] and [23]. Together with (43), we obtain (22). Note that the rate loss terms in (45) are exponentially negligible (independent of $\rho_d$) in the DMT analysis.